This paper examines the accuracy of corruption perceptions by comparing Indonesian villagers' reported 
perceptions about corruption in a road-building project in their village with a more objective measure of 
‘missing expenditures’ in the project. I find that villagers' reported perceptions do contain real information, and 
that villagers are sophisticated enough to distinguish between corruption in a particular road project and 
general corruption in the village. The magnitude of the reported information, however, is small, in part because 
officials hide corruption where it is hardest for villagers to detect. I also find that there are biases in reported 
perceptions. The findings illustrate the limitations of relying solely on corruption perceptions, whether in 
designing anti-corruption policies or in conducting empirical research on corruption. 

1. Introduction 

Corruption is thought to be a significant problem in much of the 
developing world. Corruption not only imposes a tax on public services 
and private sector activity; it also creates potentially severe efficiency 
consequences (Krueger, 1974; Shleifer and Vishny, 1993; Bertrand 
et al., 2006). Yet despite the importance of the problem, eliminating 
corruption has proved difficult in all but a few developing countries. 

One potential reason why corruption is so persistent is that citizens 
may not have accurate information about corruption. After all, since 
corruption is illegal, regularly and directly observing corrupt activity is 
almost always impossible. If citizens have accurate information about 
corruption, then the democratic process and grass-roots monitoring 
can potentially provide incentives for politicians to limit corruption. 
If, on the other hand, citizens have little in the way of accurate 
information about corrupt activity (or even if citizens know about 
average levels of corruption but do not know who is corrupt and who 
is honest), then the political process may not provide sufficient 
incentives to restrain corruption. 

The accuracy of corruption perceptions is also important because 
of their ubiquitous use by international institutions and academics to 
measure corrupt activity. For example, corruption perceptions form 
the basis of the much-cited cross-country Transparency International 
Corruption Index (Lambsdorff, 2003) and World Bank Governance 
Indicators (Kaufmann et al., 2005), and are used extensively within 
countries as well to assess governance at the sub-national level. 
Perceptions have also been widely used in academic research on the 
determinants of corruption. 1 Measuring perceptions about corruption 
rather than corruption itself skirts the inherent difficulties involved in 
measuring corruption directly, but raises the question of how those 
being surveyed form their perceptions in the first place, and how 
accurate those reported perceptions actually are. 

This paper examines the empirical relationship between reported 
corruption perceptions and a more objective measure of corruption, in 
the context of a road-building program in rural Indonesia. To construct 
an objective measure of corruption, I assembled a team of engineers 
and surveyors who, after the roads built by the project were completed, 
dug core samples in each road to estimate the quantity of 
materials used, surveyed local suppliers to estimate prices, and interviewed 
villagers to determine the wages paid on the project. From 
these data, I construct an independent estimate of the amount each 
road actually cost to build, and then compare this estimate to what the 
village reported it spent on the project on a line-item by line-item 
basis. The difference between what the village claimed the road cost to 
build and what the engineers estimated it actually cost to build forms 
my objective measure of corruption, which I label ‘missing expenditures.’ 
To obtain data on villagers’ reported perceptions of corruption, 
in the same set of villages I also conducted a household survey, in 
which villagers were asked about the likelihood of corruption in the 
road project. 


1 Prominent papers in this literature include Mauro (1995), Knack and Keefer 
(1995), LaPorta et al. (1999), and Treisman (2000). This literature is surveyed in detail 
in Rose-Ackerman (2004). 

Using these data, I find that villagers’ reported perceptions of the 
likelihood of corruption in the road project do contain information about 
the level of missing expenditures in the project. Moreover, villagers are 
sophisticated enough in their reported perceptions to distinguish between 
general levels of corruption in the village and corruption in the 
particular road project I examine. However, reported perceptions of 
corruption contain only a limited amount of information: increasing the 
missing expenditures measure by 10% is associated with just a 0.8% 
increase in the probability a villager believes that there is any corruption 
in the project. 

One reason villagers’ information about corruption may be limited is 
that officials have multiple methods of hiding corruption, and choose to 
hide corruption in the places where it is hardest for villagers to detect. In 
particular, my analysis suggests that villagers are able to detect marked- 
up prices, but appear unable to detect inflated quantities of materials 
used in the road project. Consistent with this, the vast majority of 
corruption in the project occurs by inflating quantities, with almost no 
markup of prices on average. The inability of villagers to detect inflated 
quantities, combined with the fact that officials can substitute between 
hiding corruption as inflated prices or inflated quantities, suggests that 
officials may be strategic in how they hide corruption, and that effective 
monitoring requires specialist auditors who can detect multiple types of 
corruption. 

The fact that the overall correlation between reported corruption 
perceptions and missing expenditures is positive, however, is not sufficient 
to show that the two variables can be used interchangeably as 
measures of corruption. In particular, reported perceptions may be 
systematically biased, either because individuals’ beliefs are biased, or 
because conditional on their true beliefs the way individuals report 
corruption is biased. I first show that, even controlling for village fixed 
effects (and therefore controlling completely flexibly for the actual 
level of corruption in the road) and benchmarking for how respondents 
answer the corruption question in other contexts, individual 
characteristics such as education and gender systematically predict 
respondents' reported perceptions of corruption in the road project. I 
show that these biases are not affected by how the respondents are told 
the information will be used, which suggests they may be biases in the 
respondents’ underlying beliefs rather than simply biases in how respondents 
choose to report their perceptions in the survey. 

Just because individual perceptions are biased does not necessarily 
mean that, in aggregate, corruption perceptions will give misleading 
results when investigating the determinants of corruption. To test for 
aggregate biases that would affect inference about the determinants 
of corruption, I examine the relationship between the two different 
measures of corruption and a host of village characteristics. Consistent 
with other studies, I find, for example, that increased ethnic heterogeneity 
is associated with higher levels of reported corruption perceptions 
(e.g., Mauro, 1995; LaPorta et al., 1999), and that increased 
levels of participation in social activities are associated with lower 
levels of reported corruption perceptions (e.g., Putnam et al., 1993). But 
when I examine the relationship between these variables and the 
missing expenditures variable, I find different results — ethnic 
heterogeneity is associated with lower levels of missing expenditures, 
and participation in social activities is not correlated with missing 
expenditures levels at all. 

One hypothesis that could reconcile these differences is that there 
may be a feedback mechanism, where biased beliefs about corruption 
lead to more monitoring behavior, which in turn lowers actual corruption. 
For example, I show that within a given village, respondents 
who are prone to believe there is more corruption generally (as measured 
by their corruption perceptions about the President of Indonesia) 
are more likely to engage in monitoring the village road project. Similarly, 
villagers in more ethnically heterogeneous villages are less likely 
to report trusting their fellow villagers, and more likely to attend project 
monitoring meetings, than those in homogeneous villages, which may 
explain why there is greater perceived corruption in heterogeneous 
villages but lower missing expenditures. 

More generally, the results suggest that when examining the correlates 
of corruption, relying on perceptions of corruption may lead to 
misleading conclusions. Instead, more objective methods of measuring 
corruption, such as the approach used here (or the related approaches 
used by Di Tella and Schargrodsky, 2003; Reinikka and Svensson, 2004; 
Fisman and Wei, 2004; Yang, 2004; Hsieh and Moretti, 2006; Olken, 
2006a), may produce more reliable results. 



Legend: district shading indicates missing expenditures, in bins of width 0.05 running from <= 0 to > 0.40. 

Notes: Each dot shows the location of a study subdistrict. The average level of missing expenditures in the district is indicated by the shading of the district. The major 
cities of East and Central Java are labeled. 

Fig. 1. Map of study area. 

This paper is related to several literatures in economics that seek 
to characterize the relationship between reported beliefs and reality 
more generally. Bertrand and Mullainathan (2001) discuss the psychological 
underpinnings of biases in answers to subjective survey 
questions, and there is a large literature examining the accuracy and 
potential biases in individuals’ forecasts of their own future retirement 
decisions, mortality, and income. 2 In the public sphere, several 
authors have also found that reported perceptions are positively correlated 
with more objective measures of performance, in the very 
different contexts of international perceptions of bribery (Mocan, 
2004), prices paid by Bolivian hospitals for medical supplies (Gray- 
Molina et al., 2001), and principals evaluating teachers (Jacob and 
Lefgren, 2005). In the setting closest to that examined here, however, 
Beaman et al. (2008) document that women leaders in Indian villages 
deliver better public services than male leaders, yet score worse on 
measures of citizen satisfaction. Their results, consistent with the 
results presented here, suggest that there may be political market 
failures caused by inaccuracies in public perceptions about the performance 
of government officials. 

The remainder of this paper is organized as follows. Section 2 
discusses the empirical setting and the data used in the paper. Section 3 
examines the degree to which individual villagers have information 
about actual corruption levels. Section 4 examines the degree to which 
villagers’ reported perceptions about corruption are biased. Section 5 
concludes. 

2. Setting and data 

2.1. Empirical setting 

The data in this paper come from 477 villages in two of Indonesia’s 
most populous provinces, East Java and Central Java, as shown in 
Fig. 1. The villages in this study were selected because they were about 
to begin building small-scale road projects under the auspices of the 
Kecamatan (Subdistrict) Development Project, or KDP. KDP is a national 
government program, funded through a loan from the World Bank, 
which finances projects in approximately 15,000 villages throughout 
Indonesia each year. The data in this paper were collected between 
September 2003 and August 2004. 

The roads I examine are built of a mixture of rock, sand, and gravel, 
range in length from 0.5-3 km, and may either run within the village 
or run from the village to the fields. A typical road project costs on 
the order of Rp. 80 million (US$8,800 at the then-current exchange 
rate). Under KDP, a village committee receives the funds from the 
central government, and then procures materials and hires labor directly, 
rather than using a contractor as an intermediary. The allocation 
to the village is lump-sum, so that the village is the residual 
claimant. In particular, surplus funds can be used, with the approval of 
a village meeting, for additional development projects, rather than 
having to be returned to the KDP program. These funds are often 
supplemented by voluntary contributions from village residents, primarily 
in the form of unpaid labor. A series of three village-level 
meetings are conducted to monitor the use of funds by the village 
committee implementing the project. 

Corruption in the village projects can occur in several ways. First, 
village implementation teams, potentially working with the village 
head, may collude with suppliers to inflate either the prices or the 
quantities listed on the official receipts. Second, members of the implementation 
team may manipulate wage payments by inflating the wage rate or the number of 
workers paid by the project. 


2 For example, Bernheim (1989) discusses systematic variability in individual 
accuracy in forecasting retirement dates, Hurd and McGarry (1995) document that 
individuals with certain observable characteristics are systematically more likely to 
over- or under-predict their own mortality, Dominitz and Manski (1997) document that 
individuals can forecast their expected income, and Bassett and Lumsdaine (1999, 
2001) discuss how even controlling for observable characteristics, some individuals are 
likely to be over-optimistic across a wide variety of beliefs whereas others are 
systematically over-pessimistic. 


The villages in this study were part of a randomized experiment on 
reducing corruption, described in more detail in Olken (2007). Three 
experimental treatments were conducted in randomly selected subsets 
of villages: an increase in the probability of an external government 
audit of the project, an increase in the number of invitations 
distributed to the village meetings regularly held to oversee use of 
project funds, and the distribution of anonymous comment forms. 
All of the empirical specifications reported below include dummy 
variables for each of these experimental treatments to ensure that 
the effects reported here are not being driven by these experiments, 
though the results below are essentially similar if the experimental 
dummies are not included. (I discuss the effects of the experiments on 
reported corruption perceptions in Section 3.) 

The data used here come from three surveys designed by the 
author: a household survey, containing data on household beliefs 
about corruption in the project; a field survey, used to measure missing 
expenditures in the road project; and a key-informant survey 
with the village head and the head of each hamlet, used to measure 
village characteristics. In the subsequent subsections, I describe the 
two aspects of the data that are the focus of this study: the household 
survey on corruption perceptions and the field survey to measure 
missing expenditures in the road project. Additional details about the 
data collected can be found in Appendix A. 

2.2. Corruption perceptions 

Data on reported corruption perceptions were obtained from a 
survey of a stratified random sample of adults in the village. The 
survey was conducted between February 2004 and April 2004, when 
construction of the road projects was between 80% and 100% complete. 
The sample includes 3691 respondents. 

The key corruption question I examine is the following: “Generally 
speaking, what is your opinion of the likelihood of diversions of money/KKN 
(corruption, collusion, and nepotism) involving [...],” where [...] 
is 1) the President of Indonesia (at the time, Megawati Sukarnoputri), 2) 
the staff of the subdistrict office (the administrative level above the 
village), 3) the village head, 4) the village parliament, and 5) the road 
project. KKN is the Indonesian acronym for corruption, collusion, and 
nepotism, the catch-all phrase for corruption in Indonesian. Respondents 
were given five possible choices in response: none, low, 
medium, high, and very high. The first four questions (from the President 
to the village parliament) were asked, in that order, in the middle 
of the 1.5-hour survey; the question about the road project was asked 
towards the end of the survey. 

The tabulations of the responses to these corruption questions are 
given in Table 1. Several things are worth noting about the responses. 
First, the more ‘local’ the subject being asked about, the less corruption 
respondents report; i.e., respondents report the highest 
corruption levels for the President, followed by the subdistrict staff, 
followed by the village officials, followed lastly by the road project. 

Second, 8.9% of respondents do not answer the question about 
corruption in the road project, claiming either they do not know or 
they do not want to answer. In interviews it appeared that many 
people who refused to answer did so because they felt uncomfortable 
saying that there was corruption. Although respondents were assured 
that responses would remain anonymous, this reluctance to state 
opinions about corruption is common to many surveys of corruption 
(Azfar and Murrell, 2005). It is particularly understandable in this 
context, given that free speech was restricted in Indonesia until the 
end of the Soeharto government in 1998, and that even now village 
heads still wield considerable local authority. 

Table 1 
Summary statistics. 

Panel A: corruption perceptions 

Perceived corruption involving  Road project   President   Subdistrict staff   Village head   Village parliament
None                            64.1%          13.9%       22.1%               47.1%          52.4%
Low                             21.1%          12.8%       15.6%               18.0%          14.5%
Medium                          5.3%           22.9%       14.5%               9.5%           6.4%
High                            0.4%           9.2%        1.6%                1.8%           0.7%
Very high                       0.2%           3.4%        0.2%                0.3%           0.1%
Refused to answer               8.9%           37.7%       45.9%               23.3%          25.9%
Num obs.                        3691           3691        3691                3691           3691

Panel B: other variables 

Variable                                          Mean      Std. Dev.   Min       Max       Num obs.
Missing expenditures
  Missing expenditures                            0.237     0.343       -1.103    1.674     477
  Missing quantities                              0.243     0.320       -1.287    1.288     477
  Inflated prices                                 -0.014    0.210       -1.031    0.783     477
  Inflated prices, no project suppliers           -0.022    0.205       -1.051    0.451     461
  Inflated prices, buyers only                    0.046     0.258       -0.941    1.076     427
Missing expenditures for materials only
  Missing expenditures                            0.203     0.395       -1.255    1.878     477
  Missing quantities                              0.228     0.353       -1.355    1.878     477
  Inflated prices                                 -0.026    0.240       -1.031    0.832     476
  Inflated prices, no project suppliers           -0.043    0.235       -1.051    0.529     438
  Inflated prices, buyers only                    0.002     0.250       -0.941    0.783     211
Household covariates
  Education (years)                               7.340     3.238       0         18        3686
  Age                                             41.063    11.693      18        90        3691
  Female                                          0.302     0.459       0         1         3691
  Predicted log per-capita consumption            11.473    0.284       10.620    12.898    3487
  Participation in social activities
    (number of times in last 3 months)            22.449    20.159      0         162       3691
  Participation in social activities in last 3 months
    where road project likely discussed           6.801     5.907       0         55.389    3472
  Lives in project hamlet                         0.553     0.497       0         1         3691
  Attended development meeting                    0.260     0.439       0         1         3662
  Family member of village government             0.301     0.459       0         1         3691
  Family member of road project leader            0.058     0.234       0         1         3691
  Version B of survey form                        0.337     0.473       0         1         3667
Village covariates
  Log population                                  8.209     0.562       6.347     10.096    477
  Mean village education level (years)            4.257     1.082       1.061     7.806     472
  Share of population poor                        0.407     0.212       0.019     0.945     474
  Ethnic fragmentation                            0.031     0.085       0.000     0.513     472
  Religious fragmentation                         0.020     0.047       0.000     0.424     472
  Intensity of social participation               11.042    11.680      0.000     87.875    459
  Meetings with written accountability report     0.328     0.383       0.000     1.000     470
  Number of ordinances from village parliament    3.981     3.157       0.000     22.000    471

Notes: For perceived corruption, the figures given are percentage responses to the question “In general, what is your opinion of the likelihood of corruption/KKN (corruption, collusion, 
nepotism) involving [...]?” where [...] is the President of Indonesia (Megawati Sukarnoputri), the staff of the subdistrict, the village head in the respondent's village, the village parliament, 
or the road project, as indicated in the columns. Sample is limited to those villages where the missing expenditures variable is not missing. 

I therefore examine two versions of the corruption beliefs variable 
that deal with these non-responses in different ways. The first version 
is simply the five ordered categorical responses shown in Table 1, 
where “refused to answer” is treated as missing. I use ordered probit 
models to investigate the determinants of this categorical response 
variable. The disadvantage of this approach is that it disregards the 
potentially useful information contained in “refused to answer,” namely 
that those who refuse to answer often believe there is corruption but 
are unwilling to say so. I therefore create a second version of the beliefs 
variable called “any likelihood of corruption” that groups all positive 
likelihood of corruption answers together with non-responses. This 
variable is equal to 1 if the respondent reports any positive probability 
of corruption (low, medium, high, or very high) or refused to answer 
the corruption question, and 0 otherwise. 3 I use probit models to 
investigate the determinants of this variable. As will be discussed in 
more detail below, the two variables produce broadly similar results. 
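
The two codings of the perceptions variable described above can be sketched in a few lines. This is a minimal illustration, not the code used for the paper: the response labels, the `REFUSED` marker, and the function names are assumptions introduced here.

```python
# Hypothetical response labels; the survey's actual data coding is not given in the text.
ORDER = {"none": 0, "low": 1, "medium": 2, "high": 3, "very high": 4}
REFUSED = "refused"


def categorical_response(answer):
    """Ordered category 0..4; refusals are treated as missing (None)."""
    if answer == REFUSED:
        return None
    return ORDER[answer]


def any_likelihood(answer):
    """1 for any positive likelihood or a refusal to answer, 0 for 'none'."""
    if answer != REFUSED and answer not in ORDER:
        raise ValueError(f"unknown answer: {answer!r}")
    return 0 if answer == "none" else 1


answers = ["none", "low", "refused", "very high", "none"]
categorical = [categorical_response(a) for a in answers]
dummy = [any_likelihood(a) for a in answers]
print(categorical)  # [0, 1, None, 4, 0]
print(dummy)        # [0, 1, 1, 1, 0]
```

The only difference between the two codings is the treatment of refusals: they drop out of the categorical variable but count as a positive perception in the dummy variable.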

2.3. Missing expenditures 

The independent measure of corruption I use is “missing expenditures” 
in the road project. Missing expenditures are the difference in logs 
between what the village claimed it spent on the project and an independent 
estimate of what it actually spent. This measure is approximately 
equal to the percent of expenditures on the road project that cannot be 
accounted for by the independent estimate of expenditures. 


3 Alternatively, if I use a dummy variable for any positive perceptions of corruption, 
but drop missings rather than count them as a positive perception, the results are 
slightly weaker than the results presented. This is consistent with the idea that a 
non-response is associated with a positive perceived corruption probability. 


Obtaining data on what villages claim they spent is relatively 
straightforward. At the end of the project, all village implementation 
teams were required by KDP to file an accountability report with the 
project subdistrict office, in which they reported the prices, quantities, 
and total expenditure on each type of material and each type of labor 
(skilled, unskilled, and foreman) used in the project. The total amount 
reported must match the total amount allocated to the village. These 
reports were obtained from the village by the survey team. 

Horizontal axis: Log(Reported) - Log(Actual). Series: Missing Expenditures, Inflated Prices, Missing Quantities. 

Notes: The graph shows kernel density estimates of the PDFs of the missing 
expenditures, missing quantities, and inflated prices variables. 

Fig. 2. Distributions of missing expenditures. 

Obtaining an independent estimate of what was actually spent was 
substantially more difficult, and involved three main activities: an 
engineering survey to determine quantities of materials used, a worker 
survey to determine wages paid by the project, and a supplier survey to 
determine prices for materials. In the engineering survey, an engineer 
and an assistant conducted a detailed physical assessment of all physical 
infrastructure built by the project in order to obtain an estimate of the 
quantity of main materials (rocks, sand, and gravel) used. In particular, 
to estimate the quantity of each of these materials used in the road, the 
engineers dug ten 40 cm × 40 cm core samples at randomly selected 
locations on the road and measured the quantities of each material in 
each core sample. By combining the measurements of the volume of 
each material per square meter of road with measurements of the total 
length and average width of the road, I can estimate the total quantity of 
materials used in the road. I also conducted calibration exercises to 
estimate a “loss ratio,” i.e., the fraction of materials that are typically 
lost as part of the normal construction process. 4 
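
The scaling from core samples to a whole-road quantity can be illustrated as follows. This is a rough sketch under stated assumptions, not the engineers' actual procedure: the function name, the simple averaging across cores, the road dimensions, the per-core volumes, and the loss ratio of 1.1 are all hypothetical; only the 40 cm × 40 cm core size and the count of ten cores come from the text.

```python
CORE_SIDE_M = 0.40               # each core sample is 40 cm x 40 cm
CORE_AREA_M2 = CORE_SIDE_M ** 2  # surface area covered by one core


def estimate_material_volume(core_volumes_m3, road_length_m, road_width_m,
                             loss_ratio=1.0):
    """Scale the mean per-core volume of one material up to the whole road.

    loss_ratio is the assumed ratio of material actually used to material
    measured in the cores (at least 1 if some material is normally lost).
    """
    if not core_volumes_m3:
        raise ValueError("need at least one core sample")
    mean_core_volume = sum(core_volumes_m3) / len(core_volumes_m3)
    volume_per_m2 = mean_core_volume / CORE_AREA_M2
    measured_volume = volume_per_m2 * road_length_m * road_width_m
    return measured_volume * loss_ratio


# Hypothetical road: ten cores, each containing 0.016 cubic meters of gravel
# (a 10 cm layer), on a road 1000 m long and 3 m wide.
cores = [0.016] * 10
estimate = estimate_material_volume(cores, road_length_m=1000.0,
                                    road_width_m=3.0, loss_ratio=1.1)
print(f"{estimate:.1f} cubic meters")
```

Averaging across randomly located cores is one simple way to obtain a per-square-meter figure; the text does not say how the ten samples were combined.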

To measure the quantity of labor, workers were asked which of the 
many activities involved in building the road were done with paid 
labor, voluntary labor, or some combination, what the daily wage and 
number of hours worked were, and to describe any piece rate arrangements 
that may have been part of the building of the project. To 
estimate the quantity of person-days actually paid out by the project, I 
combine information from the worker survey about the percentage of 
each task done with paid labor, information from the engineering 
survey about the quantity of each task, and assumptions of worker 
capacity derived both from the experience of field engineers and the 
experience from building the calibration roads. 

To measure prices, a price survey was conducted in each subdistrict. 
Since there can be substantial differences in transportation costs within 
a subdistrict, surveyors obtained prices for each material that included 
transportation costs to each survey village. The price survey included 
several types of suppliers, namely supply contractors, construction supply 
stores, truck drivers (who typically transport the materials used in the 
project), and workers at quarries, as well as recent buyers of material 
(primarily workers at construction sites). For each type of material used 
by the project, between three and five independent prices were 
obtained; I use the median price from the survey for the analysis. To 
minimize the potential for reporting bias, in all cases price surveys were 
conducted in villages in the subdistrict other than the village for which 
the data would be used. Respondents were also not informed that the 
survey was related to an analysis of the road project. 5 
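
As a small illustration of why the median of the surveyed prices is a natural choice, the sketch below compares it with the mean when one quote is an outlier. The quotes and the function name are hypothetical; only the use of the median comes from the text.

```python
from statistics import mean, median


def survey_price(quotes):
    """Reference price for one material in one village: the median quote."""
    if not quotes:
        raise ValueError("need at least one price quote")
    return median(quotes)


# Hypothetical quotes (Rp. per cubic meter) from four independent respondents,
# one of whom reports a much higher price than the others.
quotes_rp_per_m3 = [48_000, 50_000, 52_000, 75_000]
print(survey_price(quotes_rp_per_m3))  # median is insensitive to the high quote
print(mean(quotes_rp_per_m3))          # mean is pulled upward by it
```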


4 For example, some amount of sand may blow away off the top of a truck, or may 
not be totally scooped out of the hole dug by the engineers conducting the core 
sample. I estimated the ratio between actual materials used and the amount of 
materials measured by the engineering survey by constructing four test roads, where 
the quantities of materials were measured both before and after construction. In 
calculating missing expenditures, I multiply the estimated actual quantities based on 
the core samples of the road by this loss ratio to generate the actual estimated level of 
expenditures on the road project. 

5 As with quantities, the “zero corruption” level of the differences in prices might not 
be 0; for example, villages might be able to obtain discounts beyond those our 
surveyors could obtain. However, it is hard to know what these discounts might be, so I 
do not have a way of calibrating the analogous “loss ratio” for prices as I did for 
quantities. 


From these data (reported and actual quantities and prices for 
each of the major items used in the project) I construct the missing 
expenditures variable. Specifically, I define the missing expenditures 
variable to be the difference between the log of the reported amount 
and the log of the actual amount. As shown in Table 1, on average, after 
adjusting for the normal loss ratios derived from the calibration 
exercise, the mean of the missing expenditures variable is 0.24. Note, 
however, that while the levels of the missing expenditures variable 
depend on the loss ratios, the differences in missing expenditures 
across different villages do not. 6 As a result, I focus primarily on the 
differences in missing expenditures across villages rather than on the 
absolute level of missing expenditures. I also examine several alternative 
versions of the missing expenditures measure, which separate 
out inflated prices and missing quantities, focus on missing materials 
expenditures only (i.e., exclude labor), and use various subsets of 
respondents from the price survey. The mean levels of missing expenditures 
for each district in the study are shown in Fig. 1, and the 
PDFs of the missing expenditures, inflated prices, and missing quantities 
variables are shown in Fig. 2. 
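
The construction of the missing expenditures variable, and the reason cross-village differences do not depend on the loss ratio, can be sketched as follows. All village figures, names, and the loss ratio of 1.1 are hypothetical. The sketch also assumes a single loss ratio applied to every material, which makes the shift in logs exactly constant; with material-specific ratios it is only approximately constant, as footnote 6 explains.

```python
import math


def estimated_actual_total(items, loss_ratio=1.0):
    """Sum of measured quantity x loss ratio x surveyed unit price over line items."""
    return sum(quantity * loss_ratio * price for quantity, price in items)


def missing_expenditures(reported_total, actual_total):
    """Log of the reported amount minus log of the independently estimated amount."""
    if reported_total <= 0 or actual_total <= 0:
        raise ValueError("totals must be positive")
    return math.log(reported_total) - math.log(actual_total)


# Hypothetical villages: (reported total in Rp., [(measured quantity, unit price), ...])
villages = {
    "A": (80_000_000, [(500, 60_000), (400, 50_000), (2_000, 6_500)]),
    "B": (80_000_000, [(600, 60_000), (400, 50_000), (2_000, 8_000)]),
}


def measure_all(loss_ratio):
    return {
        name: missing_expenditures(reported, estimated_actual_total(items, loss_ratio))
        for name, (reported, items) in villages.items()
    }


base = measure_all(loss_ratio=1.0)
adjusted = measure_all(loss_ratio=1.1)

# The level of the measure shifts with the loss ratio, but the gap between villages does not.
gap_base = base["A"] - base["B"]
gap_adjusted = adjusted["A"] - adjusted["B"]
print(f"village A, no loss adjustment: {base['A']:.4f}")
print(f"village A, loss ratio 1.1:     {adjusted['A']:.4f}")
print(f"gap A - B: {gap_base:.4f} vs {gap_adjusted:.4f}")

# The log difference approximates the share of reported spending that is unaccounted for.
share_unaccounted_a = 1.0 - math.exp(-base["A"])
print(f"share unaccounted for in A: {share_unaccounted_a:.4f}")
```

For village A the unadjusted estimate is Rp. 63 million against Rp. 80 million reported, so the log difference is ln(80/63), while the exact unaccounted share is 1 - 63/80; the two are close but not identical, which is why the text calls the measure approximately equal to the percent missing.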

3. Comparing perceptions with missing expenditures 

3.1. The information content of villagers' reported perceptions 

I begin by estimating whether villagers’ reported corruption 
perceptions contain any information about missing expenditures. I 
consider both versions of the corruption perceptions variable described 
above: the categorical response variable and a dummy 
variable for any positive probability of corruption in the road project 
(including missings as positive responses). I estimate an ordered 
probit model of the following form: 

$$ P(P_{vh} = j) = \Phi(\theta_j - \beta c_v - X'_{vh}\gamma) - \Phi(\theta_{j-1} - \beta c_v - X'_{vh}\gamma) \qquad (1) $$

where $P_{vh}$ is the respondent’s answer to the question about perceptions 
of corruption in the road project, $c_v$ is the estimate of missing 
expenditures in the road project, $v$ represents a village, $h$ represents a 
household, $j$ is one of the $J$ categorical answers to the corruption 
perception question, $\theta_j$ is a cutoff point estimated by the model (with 
$\theta_0 = -\infty$ and $\theta_J = \infty$), $X_{vh}$ are dummies for how the household was 
sampled, which version of the form the respondent received, and the 
experimental treatments, and $\Phi$ is the Normal CDF. The test of 
whether individuals’ corruption perceptions have information is a test 
of whether the coefficient $\beta > 0$. For the dummy variable version of the 
perceptions variable, I estimate the equivalent probit equation (i.e., 
with only one threshold level $\theta$). Standard errors are adjusted for 
clustering at the subdistrict level, to take into account the fact that 
there are multiple respondents $h$ in a single village $v$ and that the 
missing expenditures variable may be correlated across villages in a 
given subdistrict. 7 
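
To make Eq. (1) concrete, the sketch below evaluates the category probabilities for a given value of missing expenditures. It is illustrative only and does not estimate anything: the cutoffs are hypothetical, the control term X'γ is set to zero, and the value 0.186 for β is taken from column (1) of Table 2 purely as an example.

```python
import math


def normal_cdf(z):
    """Standard Normal CDF, written with math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(interior_cutoffs, index):
    """Probabilities of categories 1..J given index = beta * c_v + X'gamma.

    interior_cutoffs are theta_1 < ... < theta_{J-1}; theta_0 = -inf and
    theta_J = +inf are added here, as in Eq. (1).
    """
    thetas = [-math.inf, *interior_cutoffs, math.inf]
    return [
        normal_cdf(thetas[j] - index) - normal_cdf(thetas[j - 1] - index)
        for j in range(1, len(thetas))
    ]


BETA = 0.186                     # coefficient from column (1) of Table 2
CUTOFFS = [0.4, 1.3, 2.2, 2.8]   # hypothetical cutoffs for the five answers

# Compare a village with no missing expenditures to one at 0.24.
probs_low = ordered_probit_probs(CUTOFFS, BETA * 0.0)
probs_high = ordered_probit_probs(CUTOFFS, BETA * 0.24)

for label, p_lo, p_hi in zip(["none", "low", "medium", "high", "very high"],
                             probs_low, probs_high):
    print(f"{label:>9}: {p_lo:.4f} -> {p_hi:.4f}")
```

A positive β shifts probability mass away from "none" toward the higher categories as missing expenditures rise, which is the sense in which β > 0 indicates that reported perceptions contain information.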

The results are presented in columns (1) and (4) of Table 2 for the 
categorical and dummy variables, respectively. Note that to facilitate 
interpretation, for the probit specification in column (4) I present 


6 To see this, note that the loss ratio is a multiplicative constant for each component 
of the road. If there were only one type of material used in the project, then since missing 
expenditures are expressed as the differences in logs, the loss ratio would simply be an 
additive constant. With multiple components (e.g., rocks, sand, gravel, etc.), the 
additive constant varies slightly from village to village, depending on the relative 
weights of the different components in different villages. These differences are small, 
however, so that changes in the loss ratios do not substantively affect the results. 

7 There are 143 subdistricts in the sample. One subdistrict therefore includes an 
average of 3.3 villages, so clustering at the subdistrict is more conservative than 
clustering at the village level. Clustering at the village level reduces the standard errors 
from those presented in the table. 

Table 2 

Relationship between perceptions and missing expenditures. 

Columns (1)-(3): Likelihood of corruption in road project (ordered probit). 
Columns (4)-(6): Any likelihood of corruption in road project (dummy variable 0-1, probit marginal effects). 

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Missing expenditures | 0.186 | 0.280* | 0.307** | 0.097 | 0.119** | 0.123*** |
| | (0.175) | (0.167) | (0.135) | (0.060) | (0.057) | (0.047) |
| Corruption perceptions of: | | | | | | |
| President: low | | 0.726*** | 0.130 | | 0.214*** | -0.011 |
| | | (0.119) | (0.146) | | (0.045) | (0.052) |
| President: medium | | 1.018*** | 0.253* | | 0.320*** | 0.042 |
| | | (0.132) | (0.134) | | (0.043) | (0.042) |
| President: high | | 1.180*** | 0.423*** | | 0.365*** | 0.091* |
| | | (0.163) | (0.149) | | (0.057) | (0.054) |
| President: very high | | 1.190*** | 0.305 | | 0.282*** | -0.019 |
| | | (0.299) | (0.279) | | (0.103) | (0.093) |
| President: refused to answer | | 0.432*** | -0.080 | | 0.155*** | -0.055 |
| | | (0.141) | (0.134) | | (0.042) | (0.044) |
| Subdistrict official: low | | | 0.294** | | | 0.129*** |
| | | | (0.119) | | | (0.047) |
| Subdistrict official: medium | | | 0.277** | | | 0.114** |
| | | | (0.133) | | | (0.052) |
| Subdistrict official: high | | | 0.512* | | | 0.238* |
| | | | (0.306) | | | (0.143) |
| Subdistrict official: very high | | | 0.744 | | | -0.084 |
| | | | (0.656) | | | (0.181) |
| Subdistrict official: refused to answer | | | -0.046 | | | 0.032 |
| | | | (0.110) | | | (0.039) |
| Village head: low | | | 0.495*** | | | 0.205*** |
| | | | (0.096) | | | (0.040) |
| Village head: medium | | | 0.762*** | | | 0.260*** |
| | | | (0.150) | | | (0.054) |
| Village head: high | | | 0.590** | | | 0.085 |
| | | | (0.285) | | | (0.116) |
| Village head: very high | | | 1.920*** | | | 0.650*** |
| | | | (0.438) | | | (0.044) |
| Village head: refused to answer | | | 0.302* | | | 0.102** |
| | | | (0.159) | | | (0.050) |
| Village parliament: low | | | 0.199* | | | 0.048 |
| | | | (0.113) | | | (0.047) |
| Village parliament: medium | | | 0.311 | | | 0.094 |
| | | | (0.213) | | | (0.086) |
| Village parliament: high | | | 0.595 | | | 0.210 |
| | | | (0.374) | | | (0.154) |
| Village parliament: very high | | | -0.398 | | | -0.153 |
| | | | (0.750) | | | (0.144) |
| Village parliament: refused to answer | | | 0.501*** | | | [illegible]*** |
| | | | (0.152) | | | (0.053) |
| Respondent covariates | No | No | Yes | No | No | Yes |
| Sample controls | Yes | Yes | Yes | Yes | Yes | Yes |
| Observations | 3314 | 3314 | 2931 | 3639 | 3639 | 3226 |
| Mean dep. var | | | | 0.36 | 0.36 | 0.35 |

Notes: Robust standard errors in parentheses, clustered at the subdistrict level. In 
columns (1)-(3), the dependent variable is the categorical responses to the perceptions 
question, i.e., ‘none’, ‘low’, ‘medium’, ‘high’ and ‘very high’ (in that order). In columns 
(4)-(6), the dependent variable is a dummy that takes value 0 if answer was ‘none’ and 
1 if answer was ‘low’, ‘medium’, ‘high’, ‘very high’, or if the respondent refused to 
answer. Corruption perceptions of President, subdistrict official, village head, and 
village parliament are dummies for respondent's perceived corruption levels of the 
respective officials. Respondent covariates are age, education, gender, predicted per- 
capita expenditure, participation in social activities, relationship to government and 
project officials. Sample controls are dummies for the three experimental interventions 
(audit, invitations, and invitations + comment forms), dummies for the different strata 
of respondents sampled, and a dummy for which version of the form the respondent 
received. 

*Significant at 10%; **Significant at 5%; ***Significant at 1%. 


marginal effects. Both results show a positive coefficient on the missing 
expenditures variable, though neither coefficient is statistically 
significant. 

A respondent’s answers about a particular type of corruption may be 
colored by the respondent’s attitudes about corruption in general. The 
responses to the corruption question may also differ if individuals 
perceive the levels of the scale (i.e., ‘none,’ ‘low,’ etc.) differently. To 
correct for these factors, I benchmark the respondent’s attitudes about 
corruption in general by using the respondent’s answer to the question 
about the likelihood that there is corruption involving the President of 
Indonesia. As discussed above, the phrasing of this question is the same 
as that of the question about the road project, but in this case all 
respondents are evaluating the same individual, the President of 
Indonesia. Since the person being evaluated is the same for all 
respondents, the different answers to this question capture general 
differences in the way the respondents evaluate corruption and answer 
the perceptions question. 8 This is analogous to the approach taken by 
Bassett and Lumsdaine (1999), who use responses to a question about 
the probability of the weather being sunny tomorrow to benchmark the 
overall optimism or pessimism of the respondents when interpreting 
questions about the respondent’s beliefs about future events. 9 

The results controlling for dummies corresponding to the different 
possible answers to the question about how corrupt the President is 
are presented in columns (2) and (5). The responses to the corruption 
question on the road project and the corruption question about the 
President are positively correlated (the dummy versions of these 
variables have correlation coefficient 0.16, p<0.001). Controlling for 
perceptions of how corrupt the President is substantially strengthens 
the results, increasing both the magnitudes and the statistical 
significance in both specifications. 10 

However, even controlling for the individual’s response about how 
corrupt the President is, it is possible that the correlation between 
missing expenditures in the road project and perceptions of corruption 
in the road project reflects only villagers’ perceptions of the 
average levels of corruption in their village, rather than specific 
information about the road project per se. 

To examine whether villagers have specific information about the 
road project per se, I estimate an alternative version of Eq. (1) that also 
controls as flexibly as possible for villagers’ reported perceptions 
about the general level of corruption in the village, denoted by q: 

$$P(P_{vh} = j) = \Phi(\theta_j - \beta c_v - X'_{vh}\gamma - q'\delta) - \Phi(\theta_{j-1} - \beta c_v - X'_{vh}\gamma - q'\delta) \qquad (2)$$

To capture as flexibly as possible the respondents’ general corruption 
perceptions q, I include in q the respondents’ answers to the 
corruption questions about subdistrict officials, the village head, and 


8 One might be concerned that corruption perceptions of the President may also 
capture heterogeneity in overall attitudes towards the President of Indonesia rather 
than just benchmarking for how the respondent answers the corruption question. 
However, controlling for the respondent's overall approval of the President's job 
performance, rather than how corrupt they think the President is, has no effect on the 
correlation between perceptions of corruption in the road project and the missing 
expenditures variable. Conversely, controlling for any of the respondent's other 
answers to the corruption question — i.e., perceptions of subdistrict officials, village 
head, or village parliament — has a similar effect to controlling for the corruption of the 
President, although slightly smaller in magnitude. This suggests that the effect of 
controlling for perceptions of the President's corruption is due to capturing differential 
interpretations of the corruption question, rather than individual opinions of the 
President. 

9 This benchmarking exercise is also related to the anchoring vignettes literature in 
political science, discussed by King et al. (2004). The advantage of the approach used 
here relative to benchmarking against a hypothetical vignette is that the approach here 
captures differences in the respondents' reluctance to report corruption (due, for 
example, to fear of retaliation), which would not be captured in a hypothetical 
question. 

10 A natural question is why controlling for beliefs about the President changes the 
point estimates on the correlation, rather than just reduces the standard errors. 
However, if all people in a certain area believe there is more corruption, they may 
monitor more, reducing actual corruption levels. In fact, as discussed in Section 4.2 
below, the data is consistent with this mechanism — individuals who report any 
corruption involving the President are more likely to attend one of the project 
accountability meetings. Such a mechanism would attenuate the raw correlation 
between beliefs and actual corruption unless one also controls for the overall average 
beliefs about corruption. 
the village parliament (none of whom have any official role in the road 
project), as well as a variety of respondent-level control variables: 
age, gender, per-capita expenditure (predicted from assets), participation 
in social activities, and family relationships to government and 
project officials. (The role of these respondent-level variables in 
predicting perceptions will be discussed in more detail in Section 4.1 
below.) As can be seen in columns (3) and (6) of Table 2, adding these 
many additional control variables reduces the standard errors but 
does not change the point estimates. This is despite the fact that, to 
take just one example, the correlation of respondents’ perceptions of 
corruption involving the village head and corruption involving the 
road project is 0.4. Thus, despite the relatively high correlation of 
these perceptions of different types of corruption, the results suggest 
that villagers are actually able to distinguish between general levels 
of corruption in the village and corruption in the road project in 
particular. 

To interpret the magnitudes of the estimated coefficients, consider 
the probit specification. The point estimate in column (6) suggests 
that a 10% increase in missing expenditures above the mean level (i.e., 
an increase of 0.024 from the mean level of 0.24) would be associated 
with an increase in the probability the respondent reports any corruption 
in the project of 0.0030, or an increase of about 0.8% over the 
mean level of 0.36. Put another way, the “elasticity” of a respondent 
reporting any likelihood of corruption with respect to the missing 
expenditures variable is about 0.08. Calculating the marginal effects 
from the ordered probit specifications gives results of similar 
magnitudes. While perceptions do contain information about actual 
corruption levels, the amount of that information is small. 
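
The arithmetic behind this elasticity can be retraced in a few lines. The sketch below uses only the figures quoted in the text (the marginal effect of 0.123 from column (6) of Table 2, and the means of 0.24 and 0.36); the variable names are illustrative.

```python
marginal_effect = 0.123   # probit marginal effect, column (6) of Table 2
mean_missing = 0.24       # mean of the missing expenditures variable
mean_report = 0.36        # mean of the dummy for reporting any corruption

delta_c = 0.10 * mean_missing             # a 10% increase above the mean: 0.024
delta_p = marginal_effect * delta_c       # implied change in reporting probability
relative_change = delta_p / mean_report   # proportional change in that probability
elasticity = relative_change / 0.10       # % change in P per % change in c

print(f"change in probability: {delta_p:.4f}")
print(f"relative change:       {relative_change:.3%}")
print(f"elasticity:            {elasticity:.2f}")
```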

An important question is whether this weak correlation is merely 
the result of measurement error in the missing expenditures measure, 
or actually reflects the fact that households have little information. 
Recall that to construct the missing expenditures measure, I used 
data from 10 core samples of each road, and between 3 and 5 price 
quotations for each type of material used. To investigate the role of 
measurement error, for each road I randomly split these 10 core 


Table 3 

Investigating measurement error. 

Dependent variable: Any likelihood of corruption in road project (dummy variable 0-1). 

| | (1) | (2) | (3) |
|---|---|---|---|
| Panel A: OLS linear model | | | |
| Missing expenditures | 0.096 | 0.117** | 0.109** |
| | (0.059) | (0.055) | (0.042) |
| Corruption perceptions of: | | | |
| President | No | Yes | Yes |
| Subdistrict official | No | No | Yes |
| Village head | No | No | Yes |
| Village parliament | No | No | Yes |
| Respondent covariates | No | No | Yes |
| Sample controls | Yes | Yes | Yes |
| Observations | 3639 | 3639 | 3226 |
| Mean dep. var | 0.36 | 0.36 | 0.35 |
| Panel B: IV for measurement error | | | |
| Missing expenditures | 0.111* | 0.131** | 0.128*** |
| | (0.061) | (0.058) | (0.044) |
| Corruption perceptions of: | | | |
| President | No | Yes | Yes |
| Subdistrict official | No | No | Yes |
| Village head | No | No | Yes |
| Village parliament | No | No | Yes |
| Respondent covariates | No | No | Yes |
| Sample controls | Yes | Yes | Yes |
| Observations | 3639 | 3639 | 3226 |
| Mean dep. var | 0.36 | 0.36 | 0.35 |


Notes: See notes to Table 2. Panel A replicates columns (4)-(6) of Table 2 using a linear 
probability model, rather than probit. Panel B replicates the same regressions 
instrumenting for missing expenditures calculated using half of the core samples 
with missing expenditures calculated using the other half of the core samples. 


Table 4 

Accuracy: prices vs. quantities. 

Columns (1)-(3): Likelihood of corruption in road project (ordered probit). 
Columns (4)-(6): Any likelihood of corruption in road project (dummy variable 0-1, probit marginal effects). 

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Inflated prices | 0.433 | 0.627** | 0.669*** | 0.177* | 0.205** | 0.204** |
| | (0.277) | (0.270) | (0.251) | (0.096) | (0.091) | (0.081) |
| Missing quantities | 0.057 | 0.118 | 0.112 | 0.049 | 0.069 | 0.070 |
| | (0.183) | (0.177) | (0.155) | (0.062) | (0.060) | (0.053) |
| Corruption perceptions of: | | | | | | |
| President | No | Yes | Yes | No | Yes | Yes |
| Subdistrict official | No | No | Yes | No | No | Yes |
| Village head | No | No | Yes | No | No | Yes |
| Village parliament | No | No | Yes | No | No | Yes |
| Respondent covariates | No | No | Yes | No | No | Yes |
| Sample controls | Yes | Yes | Yes | Yes | Yes | Yes |
| Observations | 3314 | 3314 | 2931 | 3639 | 3639 | 3226 |
| Mean dep. var | | | | 0.36 | 0.36 | 0.35 |


Notes: See notes to Table 2. 

*Significant at 10%; **Significant at 5%; ***Significant at 1%. 


samples and 3-5 price quotations into two groups of 5 core samples 
and 1-3 price quotations each, and use these subsamples of measurements 
to construct two different estimates of missing expenditures 
for each village. I then repeat the regressions in columns (4)-(6) 
of Table 2 instrumenting for the measure of missing expenditure 
constructed using the first set of measurements with the measure of 
missing expenditure constructed using the second set of measurements. 
For comparison, OLS results (analogous to columns 4-6 of 
Table 2) are shown in Panel A of Table 3; the results using instrumental 
variables to correct for measurement error are shown in 
Panel B of Table 3. The estimates in Panel B are only slightly larger than 
in Panel A (e.g., the coefficient in column (3) increases from 0.109 
in the OLS to 0.128 in the IV correcting for measurement error). Thus, 
at least to the extent I can detect it here, measurement error alone 
does not seem to explain the low correlation between perceptions and 
missing expenditures. 
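
The logic of the split-sample instrument can be illustrated with simulated data. The sketch below is not the paper's estimation: it omits all controls and clustering, every parameter value is hypothetical, and the measurement noise is deliberately made large so that the attenuation is visible. It shows only why instrumenting one noisy measure with a second, independently noisy measure of the same quantity removes attenuation bias.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, beta=1.0, noise_sd=1.0, seed=0):
    """Return (OLS, IV) slope estimates from two noisy measures of one regressor."""
    rng = random.Random(seed)
    truth = [rng.gauss(0, 1) for _ in range(n)]               # true regressor
    measure_1 = [t + rng.gauss(0, noise_sd) for t in truth]   # first half-sample measure
    measure_2 = [t + rng.gauss(0, noise_sd) for t in truth]   # second half-sample measure
    outcome = [beta * t + rng.gauss(0, 1) for t in truth]

    ols = cov(outcome, measure_1) / cov(measure_1, measure_1)  # attenuated towards zero
    iv = cov(outcome, measure_2) / cov(measure_1, measure_2)   # measure_2 instruments measure_1
    return ols, iv


ols_est, iv_est = simulate()
print(f"OLS: {ols_est:.3f}  IV: {iv_est:.3f}  (true slope 1.0)")
```

In the simulation the gap between the OLS and IV estimates is large because the noise is large. In Table 3 the gap is small, which is the basis for concluding that measurement error explains little of the weak correlation.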

3.2. Differential accuracy: prices vs. quantities 

There are multiple methods village officials can use to hide corruption, 
and some of these methods may be easier for villagers to 
detect than others. In particular, village officials who steal a given 
amount have two options for how to account for this missing money 
in the accounts: they can either inflate the price paid for the 
materials procured, or they can inflate the quantities of the materials 
procured (or both). To examine how perceptions of corruption are 
formed, I re-estimate Eq. (1) with the missing expenditures variable 
separated into variables representing its constituent parts, “inflated 
prices” and “missing quantities.” Specifically, I define “inflated prices” 
as the difference in logs between the prices reported by the village and 
the prices measured by the independent survey team, weighted by 
the quantities reported by the village; similarly, I define “missing 
quantities” as the difference in logs between the quantities reported 
by the village and the quantities measured by the independent survey 
team, weighted by the prices reported. “Inflated prices” therefore 
captures markups in prices, while “missing quantities” captures markups 
in quantities. 
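
One way to read these definitions is sketched below. The exact aggregation formula is an assumption, since the text specifies only "difference in logs" and the weighting; the helper `log_gap` and all prices and quantities are hypothetical.

```python
import math


def log_gap(reported, measured, weights):
    """Log of weighted reported total minus log of weighted measured total."""
    reported_total = sum(r * w for r, w in zip(reported, weights))
    measured_total = sum(m * w for m, w in zip(measured, weights))
    return math.log(reported_total) - math.log(measured_total)


# Hypothetical figures for three materials: rock, sand, gravel.
price_reported = [10.0, 8.0, 12.0]
price_measured = [10.0, 8.0, 12.0]    # prices match the survey exactly
qty_reported = [125.0, 50.0, 25.0]
qty_measured = [100.0, 40.0, 20.0]    # every reported quantity is padded by 25%

# Price gap weighted by reported quantities; quantity gap weighted by reported prices.
inflated_prices = log_gap(price_reported, price_measured, qty_reported)
missing_quantities = log_gap(qty_reported, qty_measured, price_reported)

print(f"inflated prices:    {inflated_prices:.3f}")
print(f"missing quantities: {missing_quantities:.3f}")
```

In this example all of the gap between reported and measured spending loads onto the quantities component, the pattern that villagers would find hardest to detect.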

The results are presented in Table 4. All specifications confirm that 
villagers’ perceptions of corruption in the project are strongly 
positively correlated with price markups, and only very weakly (and 
statistically insignificantly) correlated with markups in quantities. The 
estimated magnitudes for inflated prices are approximately double 
the magnitudes for missing expenditures overall. Market prices for 
commodities are commonly known to villagers, but quantities of 
commodities delivered are very difficult to estimate without careful 

Table 5 

Robustness to alternative missing expenditures measures. 

Columns (1)-(3): Likelihood of corruption in road project (ordered probit). 
Columns (4)-(6): Any likelihood of corruption in road project (dummy variable 0-1, probit marginal effects). 

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Panel A: materials only | | | | | | |
| Missing materials expenditures | 0.160 | 0.281** | 0.313*** | 0.078 | 0.105** | 0.108*** |
| | (0.148) | (0.142) | (0.116) | (0.052) | (0.049) | (0.041) |
| Observations | 3314 | 3314 | 2931 | 3639 | 3639 | 3226 |
| Panel B: materials only | | | | | | |
| Missing materials expenditures: prices | 0.418* | 0.611*** | 0.659*** | 0.171** | 0.201*** | 0.199*** |
| | (0.230) | (0.231) | (0.220) | (0.081) | (0.078) | (0.070) |
| Missing materials expenditures: quantities | 0.008 | 0.101 | 0.115 | 0.023 | 0.051 | 0.056 |
| | (0.163) | (0.153) | (0.133) | (0.055) | (0.052) | (0.046) |
| Observations | 3308 | 3308 | 2925 | 3633 | 3633 | 3220 |
| Panel C: materials only, exclude price quotes from KDP project suppliers | | | | | | |
| Missing materials expenditures: prices | 0.402 | 0.606** | 0.671*** | 0.170* | 0.200** | 0.206*** |
| | (0.255) | (0.255) | (0.233) | (0.088) | (0.085) | (0.075) |
| Missing materials expenditures: quantities | -0.009 | 0.082 | 0.101 | 0.027 | 0.054 | 0.065 |
| | (0.202) | (0.189) | (0.156) | (0.065) | (0.062) | (0.052) |
| Observations | 3046 | 3046 | 2683 | 3358 | 3358 | 2970 |
| Panel D: materials only, use price quotes from buyers only | | | | | | |
| Missing materials expenditures: prices | 0.212 | 0.391 | 0.359 | 0.124 | 0.158* | 0.120 |
| | (0.307) | (0.303) | (0.263) | (0.098) | (0.096) | (0.089) |
| Missing materials expenditures: quantities | 0.051 | 0.044 | -0.002 | 0.029 | 0.031 | 0.027 |
| | (0.238) | (0.211) | (0.185) | (0.074) | (0.070) | (0.074) |
| Observations | 1484 | 1484 | 1353 | 1650 | 1650 | 1499 |
| Notes for all panels | | | | | | |
| Corruption perceptions of: | | | | | | |
| President | No | Yes | Yes | No | Yes | Yes |
| Subdistrict official | No | No | Yes | No | No | Yes |
| Village head | No | No | Yes | No | No | Yes |
| Village parliament | No | No | Yes | No | No | Yes |
| Respondent covariates | No | No | Yes | No | No | Yes |
| Sample controls | Yes | Yes | Yes | Yes | Yes | Yes |


Notes: See notes to Table 2. In Panels A and B, missing expenditures, prices, and quantities are defined for materials (sand, rock, gravel) only, and exclude missing labor expenditures. 
In Panel C, missing materials prices is calculated using only price survey data from suppliers who had never supplied to the KDP program. In Panel D, missing materials prices is 
calculated using only price survey data from buyers of materials, not sellers. 

*Significant at 10%; **Significant at 5%; ***Significant at 1%. 


measurement, even for trained engineers; therefore, it is not surprising 
that villagers are better at detecting marked-up prices than 
inflated quantities. 

Given this result, it is interesting to compare the overall average 
levels of the inflated prices and missing quantities variables. After 
all, if villagers can detect marked-up prices but cannot detect marked-up 
quantities, village officials would in general choose to hide their 
corruption by inflating quantities rather than marking up prices. As 
discussed above, one needs to interpret the levels of the missing 
expenditures variables with caution, because the levels of these variables 
depend on assumptions about the loss ratios and on the ability 
of surveyors to obtain exactly the same prices as the villages procuring 
the material for the project. Nevertheless, the levels of the inflated 
prices and quantities variables are precisely what one would expect 
given the perceptions results: all of the missing expenditures are 
hidden by inflating quantities, not by inflating prices. Specifically, as 
shown in Table 1, the mean level of the missing quantities variable is 
0.24, while the mean level of the inflated prices variable is -0.014, 
very close to zero. 11 Thus, on average the vast majority of the missing 
expenditures appear to be occurring exactly where villagers cannot 
detect them. This raises the possibility that the relatively low correlation 
between reported perceptions and missing expenditures may in part 
reflect the strategic behavior of savvy corrupt officials who deliberately 
choose the types of corruption that are hardest to detect. 12 It also 
suggests that there may be limits in the degree to which villagers can 
effectively monitor corruption, at least in the absence of external help 
detecting it. 


11 Inflated prices could be less than 0 if, for example, villages purchasing materials 
received bulk discounts on purchase prices that were not offered to the independent 
survey team. 

3.3. Robustness to alternative missing expenditures measures 

The missing expenditures measure draws on four types of 
data: data from the accounting book for the road project, data from 
the engineer’s assessment of the road project, a price survey, and a 
labor survey. Although the accounting data and the engineering data 
are objective measures, and not subject to reporting biases, it is possible 
that respondents might systematically misreport their answers 
to the price or the labor survey. If the same omitted variable (say, 
ethnic heterogeneity) led to misreporting of corruption perceptions 
and misreporting on the price and labor components, it is possible 
that the omitted variable could be generating the correlations 
uncovered in the previous sections. 

To examine this possibility, in Table 5, I therefore repeat the analysis 
above using different missing expenditures measures that progressively 
seek to eliminate as much potential for reporting bias from 


12 A natural question is how to reconcile the facts that 1) there appear to be no 
price markups on average and 2) villagers are able to detect price markups. The 
answer is that an average price markup of zero masks the fact that some 
villages had higher-than-market prices, and others had lower-than-market prices. 
Villagers appear to detect these differences, and they are correlated with corruption 
perceptions. Perhaps the village officials in those villages where prices were marked 
up did not realize that prices would be easier to detect than quantities. 
the missing expenditures variable as possible. 13 First, to exclude 
potential biases from the labor survey, I examine missing materials 
expenditures, i.e., missing expenditures on the three main 
materials (rock, sand, and gravel) that go into the road project. 
This variable uses no data from the labor survey. Panel A of Table 5 
replicates the regressions in Table 2 examining missing materials 
expenditures, and Panel B replicates the regressions in Table 4 using 
missing materials prices and missing materials quantities. As is 
evident from Table 5, these results, which exclude all information 
from the labor survey entirely, are very similar to the main regression 
results, suggesting that reporting biases in the labor survey are 
not driving the results. 

To examine potential biases in the price survey, I exploit the fact 
that the price survey interviewed three different types of respondents: 
sellers of materials who had supplied materials to any KDP 
program (KDP is the village infrastructure scheme studied in this 
paper), sellers of materials who did not supply materials to any KDP 
program, and a small number of independent buyers of materials 
(i.e., private individuals engaged in construction projects in the 
area). If there were systematic reporting biases, one would expect 
them to be most severe for those respondents who actually supplied 
to the KDP program. Moreover, one would expect very different 
types of misreporting for sellers and buyers. 14 

Panel C of Table 5 presents results using price data only from non-KDP 
sellers of materials, and Panel D of Table 5 presents results using 
price data only from independent buyers of materials. The results in 
Panel C are virtually identical to the results in Panel B, showing that 
there is no difference from excluding prices from those who sell to the 
project. The results in Panel D, where I use information from buyers 
only, are somewhat smaller and weaker statistically than the main 
results, but remain positive in all but one case and cannot be statistically 
distinguished from the main results. The slightly smaller point estimates 
are likely explained by the fact that I have very few buyer 
observations per village: an average of only 0.87 buyers were 
surveyed in the price survey per village (i.e., not all villages had a 
buyer surveyed), as compared to 6.24 price surveys for all types of 
observations. This increases measurement error in prices and creates 
attenuation bias. All told, the results suggest that systematic misreporting 
on the labor and price components of the missing expenditures 
survey is not substantially driving the correlations between 
corruption perceptions and missing expenditures established in the 
previous section. 

4. Biases and feedback 

4.1. Are corruption perceptions systematically biased? 

This section examines whether certain types of individuals are 
systematically biased in their reported perceptions about corruption. 
To do so, I re-estimate a version of Eq. (2) that includes village fixed 
effects in addition to respondent-level variables. Since the actual level 
of corruption in the road project does not vary within the village 
(after all, there is only one road project in each village), if there are no 
individual biases, then once village fixed effects are included and once 
I benchmark for how respondents perceive the different possible 
answers to the corruption question, none of the individual characteristics 
in the regression should systematically predict corruption perceptions. 
If they do, then we know that those types of individuals 


13 Section 4.2 below discusses other tests for reporting biases in the corruption 
perceptions surveys. 

14 It is also important to recall that, as discussed above, all data on the price surveys 
came from interviews in surrounding villages, not from the village in question. Those 
being surveyed were also not informed that the survey had anything to do with the 
road-building project. These two sample design considerations were to minimize the 
possibility of reporting biases in the price survey. 


Table 6 

Are beliefs systematically biased? 

Columns (1)-(2): Likelihood of corruption in road project (continuous variable scaled to 0-1, OLS with fixed effects). 
Columns (3)-(4): Any likelihood of corruption in road project (dummy variable 0-1, conditional logit model). 

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Education (years) | 0.004*** | 0.002** | 0.065*** | 0.051*** |
| | (0.001) | (0.001) | (0.018) | (0.019) |
| Age | -0.001*** | -0.001** | -0.003 | -0.001 |
| | (0.000) | (0.000) | (0.005) | (0.005) |
| Female | -0.017*** | -0.012** | -0.183* | -0.160 |
| | (0.005) | (0.005) | (0.105) | (0.108) |
| Predicted per-capita consumption | 0.024*** | 0.021** | 0.217 | 0.148 |
| | (0.009) | (0.009) | (0.193) | (0.199) |
| Participation in social activities | 0.001 | 0.000 | 0.013** | 0.011* |
| | (0.000) | (0.000) | (0.006) | (0.006) |
| Participation in social activities where road project likely discussed | -0.003*** | -0.003** | -0.075*** | -0.069*** |
| | (0.001) | (0.001) | (0.021) | (0.021) |
| Lives in project hamlet | -0.027*** | -0.026*** | -0.781*** | -0.764*** |
| | (0.005) | (0.005) | (0.108) | (0.110) |
| Attended development meeting | -0.006 | -0.005 | -0.312*** | -0.320*** |
| | (0.005) | (0.005) | (0.110) | (0.112) |
| Family member of village government | 0.008 | 0.006 | 0.043 | 0.021 |
| | (0.005) | (0.005) | (0.112) | (0.116) |
| Family member of project leader | -0.011 | -0.009 | -0.399** | -0.402** |
| | (0.008) | (0.008) | (0.203) | (0.205) |
| Version B of survey form | 0.011 | 0.011 | 0.135 | 0.124 |
| | (0.009) | (0.009) | (0.169) | (0.170) |
| Sample controls | Yes | Yes | Yes | Yes |
| President corruption perception | No | Yes | No | Yes |
| Observations | 3727 | 3727 | 2675 | 2675 |
| R-squared | 0.49 | 0.51 | | |
| Mean dep. var | 0.40 | 0.40 | 0.40 | 0.40 |
| Fixed effects | Village | Village | Village | Village |
| p-value of joint F-test | <0.01 | <0.01 | <0.01 | <0.01 |


Notes: See notes to Table 2. President corruption perceptions refers to a dummy for the 
respondent's response to the corruption question about the President of Indonesia, as in 
Table 2. Robust standard errors in parentheses. All specifications include village fixed 
effects. Note that the sample size is lower in the conditional logit specification since all 
villages where there is no variation in the dependent variable are automatically 
dropped from the conditional logit model. 

*Significant at 10%; **Significant at 5%; ***Significant at 1%. 

described by the variable in question are systematically biased either 
towards reporting or not reporting corruption in the project. 15 

Given the incidental parameters problem, rather than estimate an 
ordered probit or probit model with a large number of dummy variables 
for each village, I instead estimate an OLS model with village 
fixed effects using the linearized version of the corruption perception 
variable (where the categorical responses are put on a scale from 0 
to 1), and a conditional logit model with the dummy version of 
the corruption perceptions variable. 16 The coefficients in the conditional 
logit models can be interpreted as log odds-ratios. 
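
To read a log odds-ratio, exponentiate it to get the multiplicative change in the odds. The sketch below applies this to the education coefficient of 0.065 from column (3) of Table 6. The function names are illustrative, and using the sample mean of 0.40 as a baseline probability is an assumption made only for illustration, since a conditional logit does not estimate village-level intercepts.

```python
import math


def odds_ratio(log_odds_coef):
    """Multiplicative change in the odds for a one-unit change in the regressor."""
    return math.exp(log_odds_coef)


def shifted_probability(baseline_prob, log_odds_coef):
    """Probability after adding log_odds_coef to the log odds implied by baseline_prob."""
    logit = math.log(baseline_prob / (1.0 - baseline_prob)) + log_odds_coef
    return 1.0 / (1.0 + math.exp(-logit))


education_coef = 0.065   # column (3) of Table 6
baseline = 0.40          # mean of the dependent variable, used as an illustrative baseline

ratio = odds_ratio(education_coef)
new_prob = shifted_probability(baseline, education_coef)
print(f"odds ratio per year of education: {ratio:.3f}")
print(f"probability moves from {baseline:.3f} to {new_prob:.3f}")
```

At this baseline, one more year of education raises the odds of reporting any corruption by roughly 7 percent, which corresponds to a change of between one and two percentage points in the probability.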

The results are presented in Table 6. For each dependent variable, I 
present two sets of results — one with no additional controls, and one 
controlling for perceptions about the President, to control for the fact 


15 In interpreting these results, it is important to note that while I can estimate 
whether bias exists, I do not know which individuals are ‘biased’ and which are 
‘unbiased’. The reason is that the dependent variable, perceptions of corruption, does 
not have a numeric scale that we know should be comparable to the missing 
expenditures variable. Thus, unlike the literature evaluating subjective probabilities 
(e.g., Dominitz and Manski, 1997; Hurd and McGarry, 2002), I cannot say which 
individuals are right and which are wrong or whether the perceived level of corruption 
is “right on average”; rather, I can only say that conditional on the actual level of 
corruption, those with high education are more likely to report higher levels of 
corruption than those with low levels of education. 

16 Specifically, for the linearized version, I assign a value of 0 to a response of ‘none’, 1 
to a response of ‘low’, 2 to a response of ‘medium’, etc. Note that for the dummy 
version of the variable, I find that linear probability models with fixed effects, rather 
than conditional logit models, produce qualitatively similar results. 
that some respondents may have interpreted the multiple response 
categories differently from others. 

Individual-level biases in reported perceptions appear quite significant. 
Conditional on village fixed effects, better educated respondents 
and male respondents tend to report more corruption; those 
who participate in the types of social activity where the project was 
likely to be discussed, those who live near the project, and (naturally) 
those who are related to the head of the project all tend to report less 
corruption. 1 Taken together, these individual-level biases are highly 
significant: the p-value from a joint test of these characteristics is 
less than 0.01 in all specifications. 

Not only are these biases statistically significant, they are large in 
magnitude as well. For example, the results show that each year of 
education makes an individual between 5 and 7 percentage points more 
likely to report corruption in the project. This implies that, holding 
actual corruption levels constant, the “elasticity” of the probability of 
reporting any likelihood of corruption with respect to the respondent’s 
education is between 0.17 and 0.22 — considerably larger than the 
impact of the actual missing expenditures variable discussed above. 

The main conclusion from these results is that these individual
biases are very substantial, especially when compared to the magnitude
of the correlation between missing expenditures and reported
corruption perceptions found above. This suggests that the
signal-to-noise ratio in reported perceptions is quite low, which may also
help explain the low overall correlation between perceptions and
missing expenditures.

4.2. Biased beliefs vs. biased reporting 

An important question is whether the biases in corruption perceptions
documented above represent systematic differences in individuals’
true beliefs about the level of corruption, or are instead
biases in how individuals report their true beliefs. If there are biases in
true beliefs, those biases might affect the degree to which individuals
monitor corruption and punish corrupt officials, whereas if they are
biases in reporting, they might not. 18

Since true beliefs cannot be observed directly, it is impossible to
conclusively disentangle biased beliefs from biased reporting. However,
there are several ways we can begin to make progress on this
issue. First, 630 respondents in villages receiving the ‘comment form’
experimental treatment (Olken, 2007) were randomly allocated to
receive one of two versions of the survey form: Form A, in which
respondents were reminded that their responses to the corruption
questions would be confidential, and Form B, in which respondents
were told that while their responses to the corruption questions
would be confidential, they would be summarized and a summary
would be read at a village ‘accountability meeting’. 19 The purpose of
this randomization was to investigate whether respondents would
change the reported amounts of corruption if they knew the results
would feed back into the village monitoring process. 20

The effects of the Form B treatment are investigated in Table 7.
First, Columns (1) and (3) of Table 7 repeat the same regressions in
Table 6 on the subsample of individuals where the Form B randomization
was conducted, using both the linearized and dummy


17 An interesting question is whether these individual characteristics lead to 
respondents being more or less accurate at detecting corruption, not just biased. To 
examine this, I also interacted each of these individual characteristics with the missing 
expenditures variable. Across a wide range of specifications, I found no evidence of 
such interactions (results not reported). 

18 A model developing this point explicitly can be found in the working paper version 
of this paper (Olken, 2006b). 

19 These accountability meetings occur in all villages as part of the normal KDP 
process: the only difference due to the randomization is whether survey respondents 
were told that their responses to the corruption question would be included in the 
aggregated, anonymous comments discussed at the accountability meeting. 

20 Note that all regressions in the paper include a Form B dummy variable, to ensure 
that this randomization is not affecting the main results. 


Table 7

Biased beliefs vs. biased reporting?

Dependent variable in columns (1) and (2): Likelihood of corruption in road project (continuous variable scaled to 0-1, OLS with fixed effects). Dependent variable in columns (3) and (4): Any likelihood of corruption in road project (dummy variable 0-1, conditional logit model). Standard errors in parentheses.

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Education (years) | -0.002 (0.003) | 0.002 (0.004) | 0.006 (0.051) | 0.075 (0.077) |
| Age | -0.002** (0.001) | -0.000 (0.001) | -0.011 (0.014) | -0.002 (0.019) |
| Female | -0.032** (0.015) | -0.004 (0.018) | -0.373 (0.284) | -0.776** (0.394) |
| Predicted per-capita consumption | 0.021 (0.028) | -0.009 (0.038) | -0.227 (0.408) | -1.029* (0.578) |
| Participation in social activities | 0.002 (0.002) | 0.000 (0.002) | 0.053** (0.027) | 0.062 (0.039) |
| Participation in social activities where road project likely discussed | -0.005 (0.005) | -0.000 (0.006) | -0.146* (0.083) | -0.214 (0.141) |
| Lives in project hamlet | -0.029** (0.015) | -0.031* (0.017) | -0.928*** (0.338) | -1.138*** (0.393) |
| Family member of village government | 0.011 (0.017) | 0.031 (0.022) | -0.071 (0.339) | 0.253 (0.395) |
| Family member of project leader | -0.062** (0.029) | -0.028 (0.034) | -0.419 (0.539) | 0.158 (0.719) |
| Form B version of survey | 0.011 (0.012) | -0.398 (0.574) | 0.129 (0.225) | -15.812 (12.926) |
| Form B x education (years) | | -0.008 (0.006) | | -0.114 (0.104) |
| Form B x age | | -0.003** (0.001) | | -0.020 (0.025) |
| Form B x female | | -0.063** (0.028) | | 0.697 (0.592) |
| Form B x predicted per-capita consumption | | 0.051 (0.052) | | 1.495 (1.135) |
| Form B x participation in social activities | | 0.003 (0.003) | | -0.018 (0.045) |
| Form B x participation in social activities where road project likely discussed | | -0.008 (0.008) | | 0.123 (0.148) |
| Form B x lives in project hamlet | | 0.020 (0.027) | | 0.214 (0.476) |
| Form B x family member of village government | | -0.040 (0.033) | | -0.608 (0.565) |
| Form B x family member of project leader | | -0.071 (0.056) | | -0.795 (1.085) |
| Sample controls | Yes | Yes | Yes | Yes |
| President corruption perception | Yes | Yes | Yes | Yes |
| Observations | 502 | 502 | 428 | 428 |
| R-squared | 0.52 | 0.54 | | |
| Mean dep. var | 0.10 | 0.10 | 0.37 | 0.37 |
| Fixed effects | Village | Village | Village | Village |
| p-value of joint F-test of main effects | 0.09 | 0.51 | 0.02 | 0.12 |
| p-value of joint F-test of Form B interactions | | 0.11 | | 0.63 |

See notes to Table 2. Village head corruption perception and President corruption refer
to dummies for the respondent's response to the corruption question about village head
and President of Indonesia, respectively, as in Table 2. Robust standard errors in
parentheses. All specifications include village fixed effects.

*Significant at 10%; **Significant at 5%; ***Significant at 1%.

versions of the reported corruption perception variable, respectively. 21
As in Table 6, the linearized versions are analyzed using OLS fixed
effects models, and the dummy versions are analyzed using conditional
logit models. The coefficient on receiving the Form B treatment
shows no significant differences in the average level of reported
corruption between the two versions of the form.


21 The one variable from Table 6 that is not included is the “attend development 
meeting” variable, because those households who were sampled because they 
attended the development meetings were not included in the Form A/Form B 
experiment. The number of observations in these regressions is less than 630 due to 
the missing values of several covariates. 
Next, columns (2) and (4) of Table 7 report results where I interact
all of the respondent characteristics in Table 6 with the Form B 
dummy. To the extent that the individual-level biases documented in 
Table 6 are reporting biases, rather than belief biases, we expect them 
to be more pronounced with the Form B version of the form — i.e., 
reporting biases should be more pronounced for those people who are 
told that their responses will actually be used to inform the political 
decisions surrounding monitoring the project. The p-value from a 
joint test of all interactions is included at the bottom of the table. 

The results from this test provide little evidence for systematic 
reporting biases. Only two interactions with the Form B variable (with 
respondent age and a female dummy) are statistically significant, and 
even these variables are only significant on the linearized variable, not the 
categorical variable. In fact, the point estimate on the Form B x female 
variable is actually of the opposite sign in column (4) when the dummy 
corruption variable is used. The p-value from the joint test of all Form B 
interactions is 0.11 in column (2) (linearized corruption variable) and 
0.63 in column (4) (dummy corruption variable). Though this test is by no 
means definitive, it suggests that many of the biases found here may 
represent biases in beliefs rather than biases in reporting. 

A second way of examining whether the biases in corruption 
perceptions shown in Table 6 are actually biases in beliefs is to examine 
whether they translate into different levels of monitoring activity, i.e., 
are those individuals who report that there is more corruption more 
likely to participate in monitoring local officials? To separate out biases 
in beliefs from actual information about corruption, I consider the 
following question: conditional on village fixed effects (so holding 
actual levels of corruption constant), are those individuals who report 
that the President of Indonesia is more likely to be corrupt more likely 
to attend monitoring meetings for the road project? By looking at 
corruption perceptions about the President of Indonesia, rather than 
the road project, I isolate the relationship between monitoring and 
general attitudes about corruption, and exclude idiosyncratic information
the respondent might have had about the road project per se that
might cause him or her to attend a monitoring meeting.

To investigate this, Table 8 presents the results from a conditional 
logit regression, where the dependent variable is a dummy for attending 
either any village meeting about the road project (columns 1 and 2) or 
attending an ‘accountability meeting’ for the road project (columns 3 
and 4), where village officials are required to account for how they spent 
the project funds. The key independent variable is corruption perceptions
of the President, either the linear measure (columns 1 and 3) or the
dummy variable version (columns 2 and 4). All household controls from 
Table 6 are included in the regression, and the village is the conditioning 
variable. The coefficients can be interpreted as log odds-ratios. 

The results show that those individuals who report that the President
of Indonesia is likely to be corrupt are substantially more likely
to attend monitoring meetings about the road project. Taking the
point estimates in column (4), for example, individuals who report
any corruption involving the President of Indonesia are 58% (0.458 log
points) more likely to attend project monitoring meetings than individuals
who do not report any corruption involving the President.
These results provide suggestive evidence not only that some of the 
biases in reported corruption may represent real beliefs, not just 
reporting biases, but also that these biases in beliefs may translate into 
real monitoring behavior. 
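Since the conditional logit coefficients are log odds-ratios, the 58% figure can be reproduced by exponentiating the column (4) point estimate. The short sketch below does this; the function names are illustrative, and the result is, strictly speaking, a percentage change in the odds of attending rather than in the probability.

```python
import math


def odds_ratio(log_odds_coef):
    """Convert a logit coefficient (log odds-ratio) into an odds ratio."""
    return math.exp(log_odds_coef)


def pct_change_in_odds(log_odds_coef):
    """Percentage change in the odds implied by a logit coefficient."""
    return 100.0 * (math.exp(log_odds_coef) - 1.0)


# Point estimates on "any corruption involving President" from Table 8.
coef_any_meeting = 0.603      # column (2)
coef_accountability = 0.458   # column (4)

print(f"column (4): odds ratio {odds_ratio(coef_accountability):.2f}, "
      f"{pct_change_in_odds(coef_accountability):.0f}% higher odds")
print(f"column (2): odds ratio {odds_ratio(coef_any_meeting):.2f}")
```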

4.3. Are aggregate biases substantial? 

The previous section showed that certain types of individuals are
systematically biased in their perceptions of corruption, and presented
some suggestive evidence that these biases are correlated with
actual decisions about how much to monitor potentially corrupt
officials. For these biases to feed back to affect monitoring and,
ultimately, corruption levels, these individual biases would have to be
both large and correlated with village characteristics.


Table 8

Corruption perception and attendance at monitoring meetings.

Dependent variable in columns (1) and (2): Attend any village meeting about the road project (dummy variable 0-1, conditional logit model). Dependent variable in columns (3) and (4): Attend village accountability meeting for road project (dummy variable 0-1, conditional logit model). Standard errors in parentheses.

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Corruption perceptions of President (linearized from 0-1) | 0.160** (0.078) | | 0.173* (0.094) | |
| Any corruption involving President (dummy variable) | | 0.603*** (0.205) | | 0.458* (0.253) |
| Education (years) | 0.043* (0.024) | 0.044* (0.024) | 0.083** (0.034) | 0.087*** (0.033) |
| Age | 0.019*** (0.007) | 0.019*** (0.007) | 0.021** (0.008) | 0.021** (0.009) |
| Female | -0.185 (0.164) | -0.183 (0.163) | -0.678*** (0.235) | -0.683*** (0.234) |
| Predicted per-capita consumption | 0.406 (0.255) | 0.339 (0.257) | 0.284 (0.333) | 0.220 (0.334) |
| Participation in social activities | 0.002 (0.008) | 0.001 (0.008) | -0.020* (0.011) | -0.020* (0.010) |
| Participation in social activities where road project likely discussed | 0.046 (0.034) | 0.050 (0.035) | 0.114*** (0.042) | 0.116*** (0.041) |
| Lives in project hamlet | 0.674*** (0.169) | 0.671*** (0.167) | 0.959*** (0.216) | 0.943*** (0.215) |
| Family member of village government | 0.258* (0.155) | 0.271* (0.157) | 0.586*** (0.208) | 0.599*** (0.209) |
| Family member of project leader | -0.186 (0.222) | -0.186 (0.219) | 0.007 (0.343) | -0.009 (0.341) |
| Version B of survey form | 0.135 (0.169) | 0.124 (0.170) | 0.019 (0.025) | 0.019 (0.025) |
| Sample controls | Yes | Yes | Yes | Yes |
| Observations | 1249 | 1249 | 829 | 829 |
| Fixed effects | Village | Village | Village | Village |

Notes: See notes to Table 6. All specifications are conditional logit models, where the
village is the conditioning variable. Coefficient estimates are expressed as log odds-ratios.
Robust standard errors in parentheses. Note that the sample size is lower in columns (3)
and (4) as, in the conditional logit specification, all villages where there is no variation
in the dependent variable are automatically dropped from the model.

This section examines empirically whether aggregate biases are 
substantial enough to affect qualitative conclusions about the correlates
of corruption. In doing this, it is important to note that I do not
necessarily claim a causal interpretation of the relationship between 
these variables and corruption; rather, the main question of interest 
is the consistency of the partial correlations between these variables 
and corruption across the various ways of measuring corruption. 

To examine aggregate biases, I estimate the following two regressions
via OLS:

$$c_v = \alpha_1 + Z_v'\alpha_2 + \nu_v \qquad (3)$$

$$P_{vh} = \beta_1 + Z_v'\beta_2 + X_{vh}'\beta_3 + \nu_v \qquad (4)$$

and examine the similarity or difference between the coefficients $\alpha_2$
and $\beta_2$, which capture the impact of village characteristics $Z$ on
missing expenditures and perceived corruption, respectively. To
obtain the most comparable possible coefficients across these very
different measures, I normalize all of the corruption measures to have
mean 0 and standard deviation 1, so that all coefficients can be interpreted
in terms of standard deviation changes in the corruption
measure. I denote the normalized versions of missing expenditures by
$c_v$ and the normalized version of perceptions by $P_{vh}$. 22 That being said,

22 For the categorical-response perceptions variable, I impose a linear scale on the
variable, and then normalize this linearized variable to have mean 0 and standard
deviation 1. Although this imposes a linearized form on the categorical response variable,
as discussed in Footnote 1 above, in other specifications OLS regressions using this
linearized variable produce qualitatively similar results to the ordered probit specifications,
which suggests that the linear assumptions are not substantially affecting the
results. I have also considered ordered probit and probit versions of Eq. (4), and they
produce qualitatively similar results to those in Table 9 below. Similarly, for the binary
dependent variable, I normalize the variable to have mean 0 and standard deviation 1.
Table 9

Village-level differences.

Dependent variables: column (1), Missing expenditures; columns (2) and (3), Likelihood of corruption in road project (linear scale, Std Dev 1); columns (4) and (5), Any likelihood of corruption in road project (dummy variable, Std Dev 1); columns (6) and (7), Trust other villagers (dummy variable, Std Dev 1). Standard errors in parentheses.

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| Demographics | | | | | | | |
| Log population | 0.263** (0.112) | 0.176*** (0.061) | 0.129** (0.055) | 0.175*** (0.058) | 0.119** (0.051) | -0.142* (0.076) | -0.156** (0.075) |
| Mean village education level (years) | -0.040 (0.047) | -0.052 (0.032) | -0.021 (0.031) | -0.049 (0.034) | -0.007 (0.033) | -0.010 (0.041) | -0.024 (0.043) |
| Share of population poor | -0.335 (0.252) | -0.143 (0.165) | -0.123 (0.133) | -0.113 (0.159) | -0.062 (0.140) | 0.406** (0.191) | 0.437** (0.191) |
| Social characteristics | | | | | | | |
| Ethnic fragmentation | -1.449** (0.568) | 1.721*** (0.322) | 1.297*** (0.293) | 1.928*** (0.340) | 1.467*** (0.332) | -1.082* (0.593) | -0.929 (0.651) |
| Religious fragmentation | -1.350 (1.089) | 0.082 (0.734) | -0.318 (0.705) | 0.092 (0.721) | -0.301 (0.700) | -1.031 (0.756) | -0.822 (0.755) |
| Intensity of social participation | 0.024 (0.064) | -0.054* (0.029) | -0.041 (0.025) | -0.073** (0.029) | -0.063** (0.026) | 0.084** (0.042) | 0.077* (0.043) |
| Transparency | | | | | | | |
| Meetings with written accountability report | -0.243 (0.155) | -0.021 (0.093) | -0.007 (0.074) | -0.077 (0.089) | -0.082 (0.076) | -0.231** (0.114) | -0.232** (0.113) |
| Number of ordinances from village parliament | -0.019 (0.017) | 0.007 (0.012) | 0.015 (0.009) | 0.012 (0.011) | 0.015 (0.010) | 0.006 (0.013) | 0.007 (0.013) |
| Experimental interventions | | | | | | | |
| Audit treatment | -0.302** (0.130) | -0.053 (0.088) | -0.125* (0.070) | -0.028 (0.087) | -0.064 (0.072) | -0.179 (0.115) | -0.152 (0.118) |
| Invitations treatment | -0.030 (0.106) | 0.024 (0.077) | 0.026 (0.068) | -0.006 (0.069) | 0.023 (0.064) | -0.054 (0.069) | -0.054 (0.070) |
| Invitations + comment treatment | -0.022 (0.094) | 0.161* (0.089) | 0.121 (0.073) | 0.108 (0.088) | 0.095 (0.077) | 0.091 (0.082) | 0.123 (0.084) |
| Corruption perceptions of | | | | | | | |
| President | N/A | No | Yes | No | Yes | No | Yes |
| Subdistrict official | N/A | No | Yes | No | Yes | No | Yes |
| Village head | N/A | No | Yes | No | Yes | No | Yes |
| Village parliament | N/A | No | Yes | No | Yes | No | Yes |
| Respondent covariates | N/A | No | Yes | No | Yes | No | Yes |
| Sample controls | N/A | Yes | Yes | Yes | Yes | Yes | Yes |
| Observations | 443 | 3056 | 2716 | 3366 | 2996 | 3302 | 2954 |
| R-squared | 0.06 | 0.06 | 0.23 | 0.06 | 0.17 | 0.05 | 0.05 |

Dependent variable in column (1) is missing expenditures; dependent variable in columns (2) and (3) is the linearized variable of corruption perceptions described in the text,
dependent variables in columns (4) and (5) are dummy variables of corruption perceptions, and dependent variables in columns (6) and (7) are a dummy for trusting other villagers.
Note that all dependent variables have been rescaled to have mean 0 and standard deviation 1. Observations in columns (2)-(7) are weighted by the inverse of the number of
observations in each village, to ensure that each village receives the same weight as in column (1). Estimation is by OLS, though as discussed in the text, estimation of columns (2) and
(3) by ordered probit and columns (4) and (5) by probit produces qualitatively similar results. Sample controls are as defined in the notes to Table 2; household controls are all of the
individual respondent-level variables considered in Table 6; village head and President corruption perception refer to dummies for how the respondent answered the corruption
questions about the village head and President of Indonesia, respectively, as included in column (3) of Table 2. Robust standard errors in parentheses, adjusted for clustering at
subdistrict level. *Significant at 10%; **Significant at 5%; ***Significant at 1%.

I will focus primarily on those results for which the estimated coefficients
$\alpha_2$ and $\beta_2$ are of different sign, not just of different magnitude,
so as not to rely too heavily on this normalization. 23
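The normalization to mean 0 and standard deviation 1, together with the inverse-village-size weighting described in the notes to Table 9, can be sketched as below. This is an illustration under stated assumptions: the function names and toy data are hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, since the text does not say which was used.

```python
import statistics
from collections import Counter


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (population SD assumed)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def inverse_village_weights(village_ids):
    """Weight each respondent by 1 / (number of respondents in the same village),
    so that every village carries the same total weight."""
    counts = Counter(village_ids)
    return [1.0 / counts[v] for v in village_ids]


# Toy data: three respondents in village "A", one in village "B".
villages = ["A", "A", "A", "B"]
perceptions = [0.0, 0.25, 0.5, 1.0]

z = standardize(perceptions)
w = inverse_village_weights(villages)
print([round(v, 3) for v in z])
print(w)
```

With these weights, the three respondents in village "A" jointly count as much as the single respondent in village "B", mirroring the one-observation-per-village structure of the missing expenditures regression.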


23 An alternative, equivalent approach which does not rely on these normalizations is
as follows. Suppose the true model of the world is:

$$c_v = \alpha_1 + Z_v'\alpha_2 \qquad (5)$$

$$B_{vh} = \beta_1 + Z_v'\beta_2 + X_{vh}'\beta_3 + \beta_4 c_v + \nu'_v. \qquad (6)$$

The coefficient of interest is $\alpha_2$, the effect of village characteristics $Z$ on real corruption
levels. In most circumstances, $c$ is unobserved. If we assume that $\beta_2 = 0$ (i.e., $Z_v$ affects $B_{vh}$
only through its effect on $c_v$), then the estimated coefficient in Eq. (4) in the text will be
equal to $\alpha_2$, the coefficient of interest. In the setting here, we also observe $c$, so we can
estimate Eq. (6) directly and test whether $\beta_2 = 0$, i.e., we can test directly whether $Z$ has
no direct effect on perceptions other than through its effect on the percent missing. This is
equivalent to testing whether $\alpha_2 = \beta_2$ in Eqs. (3) and (4) in the text. Estimating Eq. (6) and
testing whether $\beta_2 = 0$ yields similar results to those presented in Table 9 and discussed
below. In particular, the coefficients $\beta_2$ in Eq. (6) are statistically significantly different from
0 for the mean village education level, ordinances from village parliament, social
participation, and village ethnic fragmentation.
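A worked substitution may make the logic of this footnote easier to follow. The step below is a reader's gloss under stated assumptions, not part of the original derivation: it takes Eqs. (5) and (6) as written in the footnote and substitutes the first into the second.

```latex
\begin{align*}
B_{vh} &= \beta_1 + Z_v'\beta_2 + X_{vh}'\beta_3
          + \beta_4\,(\alpha_1 + Z_v'\alpha_2) + \nu'_v \\
       &= (\beta_1 + \beta_4\alpha_1)
          + Z_v'(\beta_2 + \beta_4\alpha_2)
          + X_{vh}'\beta_3 + \nu'_v .
\end{align*}
```

A regression of perceptions on $Z_v$ that omits $c_v$ therefore recovers $\beta_2 + \beta_4\alpha_2$. With $\beta_2 = 0$ this reduces to $\beta_4\alpha_2$, which coincides with $\alpha_2$ when perceptions move one-for-one with actual corruption ($\beta_4 = 1$, an assumption made here for illustration; the normalization of both measures to standard deviation 1 is what makes such a scaling plausible).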


The results are presented in Table 9. Column (1) presents the results 
when missing expenditures is the dependent variable, columns (2) and
(3) present the results when the scaled linear version of perceptions is
the dependent variable, and columns (4) and (5) present the results 
when the scaled dummy version of perceptions is the dependent 
variable. Columns (2) and (4) do not include the controls for corruption 
perceptions of the President, the village head, etc. or the respondent- 
level characteristics included in Table 6; columns (3) and (5) do. 

The results suggest that, for basic demographic characteristics such
as population and education, perceptions (columns 2-5) give results
similar to the more objective missing expenditures measure
(column 1). But when considering characteristics related to trust, such
as ethnic heterogeneity and social participation, examining the impact
on corruption perceptions rather than actual corruption may lead to
biased conclusions.

Of particular note are the estimates on ethnic heterogeneity. The
cross-country corruption literature has found that heterogeneity is positively
associated with corruption perceptions (e.g., Mauro, 1995; LaPorta
et al., 1999). Following the standard approach in the literature, I construct
as a measure of ethnic and religious heterogeneity the probability that











B.A. Olken / Journal of Public Economics 93 (2009) 950-964 


two randomly drawn individuals are from different ethnic or religious 
groups, respectively. 24 Consistent with the literature, I find that ethnic 
heterogeneity is associated with greater perceived levels of corruption. 
Specifically, moving from a village with no ethnic heterogeneity to a village 
with the maximum ethnic heterogeneity in the sample (0.51) is associated 
with an increase of between 0.65 and 1 standard deviations in the 
perceived corruption measure, equivalent to an increase of about 50 
percentage points in the probability of reporting positive corruption in the 
project. 25 However, when I examine the relationship between ethnic 
heterogeneity and the missing expenditures measure, I get the opposite 
result — moving from a village with no ethnic heterogeneity to a village 
with the maximum ethnic heterogeneity in the sample is associated with a 
decrease in the percent missing variable of about 0.73 standard deviations. 
The coefficients on religious fragmentation show a similar pattern — a 
large negative coefficient when missing expenditures is the dependent 
variable, and coefficients much closer to zero (and in some cases positive) 
when perceptions are the dependent variable — though the results on 
religious heterogeneity are not statistically significant. 
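The heterogeneity measure and the magnitudes quoted above can be checked with a few lines of arithmetic. The sketch below assumes the usual fractionalization formula (one minus the sum of squared group shares, i.e., draws with replacement), which the text does not spell out, and uses the ethnic fragmentation coefficients from Table 9; the function and variable names are illustrative.

```python
def fragmentation(group_shares):
    """Probability that two random draws (with replacement) come from different
    groups: 1 minus the sum of squared population shares."""
    return 1.0 - sum(s * s for s in group_shares)


MAX_ETHNIC_FRAG = 0.51  # maximum ethnic heterogeneity in the sample, from the text

# Ethnic fragmentation coefficients from Table 9, in standard deviation units.
coef_missing = -1.449                            # column (1)
coef_perceptions = [1.721, 1.297, 1.928, 1.467]  # columns (2)-(5)

# Implied change when moving from heterogeneity 0 to the sample maximum.
effect_missing = coef_missing * MAX_ETHNIC_FRAG
effects_perceptions = [c * MAX_ETHNIC_FRAG for c in coef_perceptions]

print(f"two equal-sized groups: {fragmentation([0.5, 0.5]):.2f}")
print(f"missing expenditures: {effect_missing:.2f} SD")
print(f"perceptions: {min(effects_perceptions):.2f} to "
      f"{max(effects_perceptions):.2f} SD")
```

The implied range for perceptions (roughly 0.66 to 0.98 standard deviations) and the implied change in missing expenditures (roughly -0.74) line up with the "between 0.65 and 1" and "about 0.73" figures quoted in the text.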

One possible explanation for these differences is that, much as
respondents who believe the President of Indonesia is more likely to be
corrupt are more likely to go to monitoring meetings (see Section 4.2
above), ethnic heterogeneity lowers the level of trust in the village,
resulting in higher perceived levels of corruption, more monitoring, and
lower actual corruption. In fact, there is suggestive evidence consistent
with this explanation. The household survey asked respondents a
version of the World Values Survey trust question, in which respondents
were asked about the degree to which they trust other residents of the
village. 26 On average, 52% of residents in villages with ethnic heterogeneity
less than 0.05 reported trusting their fellow villagers, whereas
only 36% of residents in villages with ethnic heterogeneity greater than
0.05 reported trusting their co-residents.

To examine this relationship more systematically, columns (6) and 
(7) of Table 9 regress trusting other villagers on the same set of village 
characteristics used before. The results (which are expressed as standard 
deviations to be comparable to other rows in the tables) show that the 
negative relationship between ethnic heterogeneity and trust is 
statistically significant (column 6), though it attenuates very slightly 
when we also include individual respondent covariates (column 7). 
Moreover, in high ethnic heterogeneity villages (defined similarly using
the 0.05 threshold), the number of people who attended accountability
meetings was 22% higher than in villages with low ethnic heterogeneity. 27
These findings provide suggestive evidence that lower levels
of trust correlated with ethnic heterogeneity lead to more negative
corruption perceptions, which in turn lead to higher levels of monitoring,
lowering actual corruption levels. 28


24 Overall, the sample is relatively homogeneous — the probability that two 
individuals in the same village are from different ethnic groups is greater than 0.05 
in only 9% of villages, and the probability that two individuals in the same village are 
from different religious groups is greater than 0.05 in only 10% of villages. In results not 
reported in the table, I have verified that the results are qualitatively similar if I remove 
outliers in the ethnic heterogeneity variable. 

25 Moreover, controlling for the overall heterogeneity in the village, those respondents 
whose ethnic group differed from that of the village head were 12 percentage points more 
likely to report positive probability of corruption in the project (results not shown). 

26 Specifically, they were asked: “In general, do you think that other residents of the 
village can be trusted, or you have to be careful in dealing with them?” The variable is 
coded 1 if the respondents say they can trust other residents of the village, and 0 if 
they say they have to be careful in dealing with them. 

27 This difference is statistically significant at the 1% level. The point estimates are 
virtually identical if the village-level characteristics included in Table 9 are included as 
well (except, obviously, ethnic heterogeneity). Using a linear ethnic heterogeneity 
measure, rather than the discrete cutoff for ethnic heterogeneity greater than 0.05, 
gives very similar results. 

28 This feedback mechanism may also explain the difference between the result in 
this paper that ethnic heterogeneity leads to less corruption and results elsewhere. For 
example, the rice program studied in Olken (2006a) had no participatory monitoring 
mechanism, so the feedback mechanism postulated here between ethnic heterogeneity,
increased monitoring, and less corruption would not have occurred in the case
of the rice program. 


A similar effect to ethnic heterogeneity — though in the opposite 
direction — can be seen by looking at the social participation variables. 
I define the intensity of social participation as the average number of 
times an adult in the village participated in a social group of any kind 
in the past 3 months. This measure is obtained from a census of social 
groups obtained from the head of each hamlet. As can be seen in
Table 9, increased participation in social groups in the village is associated
with a decrease in perceived corruption levels. This is consistent
with the results reported by Putnam et al. (1993). But when we look at
the actual corruption level, we find, if anything, that increased social 
participation is associated with higher measured corruption levels, 
though the point estimate is statistically insignificant. Similar, though 
weaker, differences between the perceptions variable and missing 
expenditures appear when we consider a measure of whether there is 
a political opposition in the village that could potentially monitor the 
project — the degree of activeness of the village parliament, or BPD. 29 

4.4. Using perceptions to detect experimental impacts 

Given the difficulties in measuring corruption directly, an important
question is the degree to which perceptions data can substitute for
more direct measures of corruption in cases where the latter are difficult
or infeasible to collect. To examine this, Table 9 also examines how the
experimental results reported in Olken (2007) would have differed
had the perceptions-based measure been used to evaluate corruption
instead of the missing expenditures measure. As described above,
there were three experimental interventions in these villages: an
audit treatment, in which villages were told in advance that they would
be audited by the central government audit agency with probability 1;
an invitations treatment, where hundreds of written invitations were
passed out to villagers to attend accountability meetings; and an anonymous
comment form treatment, where villagers were able to give
comments about the project without fear of retaliation.

As can be seen in column (1) of Table 9, the audit intervention was
associated with a statistically significant reduction in missing expenditures
of about 0.3 standard deviations, whereas the invitations and
invitations plus comment forms treatments were associated with a
very small, and statistically insignificant, reduction in the missing
expenditures variable. By contrast, when examining the perceptions
variable, the audit treatment has a much smaller (and in most specifications
not statistically significant) effect, and the invitations and
invitations plus comment form treatments are associated with increases
in the perceptions of corruption, in some cases statistically
significantly so. The conclusions from the study would therefore have
been quite different had perceptions been the main measure of corruption,
rather than missing expenditures.
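The contrast between the two measures can be made concrete by computing rough t-ratios from the Table 9 point estimates and standard errors. The sketch below is only a back-of-the-envelope check: it applies two-sided normal critical values (an assumption, since the paper's significance stars come from its own clustered standard errors and test distribution), and the function and dictionary names are illustrative.

```python
def stars(coef, se):
    """Rough significance stars from two-sided normal critical values (assumed)."""
    t = abs(coef / se)
    if t >= 2.576:
        return "***"
    if t >= 1.960:
        return "**"
    if t >= 1.645:
        return "*"
    return ""


# (coefficient, standard error) pairs from Table 9.
audit = {
    "(1) missing expenditures": (-0.302, 0.130),
    "(2) perceptions, linear": (-0.053, 0.088),
    "(3) perceptions, linear + controls": (-0.125, 0.070),
    "(4) perceptions, dummy": (-0.028, 0.087),
    "(5) perceptions, dummy + controls": (-0.064, 0.072),
}
invitations_comment = {
    "(1) missing expenditures": (-0.022, 0.094),
    "(2) perceptions, linear": (0.161, 0.089),
}

for label, (coef, se) in audit.items():
    print(f"audit {label}: t = {coef / se:.2f} {stars(coef, se)}")
for label, (coef, se) in invitations_comment.items():
    print(f"invitations + comment {label}: t = {coef / se:.2f} {stars(coef, se)}")
```

Under this approximation the audit effect is significant at the 5% level only for missing expenditures, while the invitations plus comment treatment shows a marginally significant increase only in perceptions, matching the pattern of stars in Table 9.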

In this particular case, we can speculate as to some of the specific
reasons why perceptions and actual corruption would respond differently
to the experimental treatments. For example, as reported in Olken
(2007), the audits primarily resulted in a reduction in missing quantities,
whereas the results in Table 4 show that villagers are much better
at detecting inflated prices. Villagers’ perceptions of corruption do not
detect precisely the type of corruption where the experiments had the
greatest impact. One can also easily imagine that anonymous
comment forms would increase people’s perceptions of corruption by
providing information about corruption, while in fact having the opposite
effect on actual corruption levels. More generally, the difference
in the results between the two types of measures highlights the importance
of obtaining unbiased, direct measures of corruption.


29 To measure how active the BPD was, I examine the number of ordinances the BPD had 
issued in the previous year. Though the BPD coefficients are not separately significant in 
either the missing expenditures or perceptions regressions, the difference between them 
appears to be statistically significant using the method discussed in Footnote 5. I also 
consider another measure of transparency in the project — whether a written report of 
project expenses was provided at a village accountability meeting — though the results are 
not conclusive in either direction. 
5. Conclusion 

This paper has examined the relationship between perceptions of 
corruption and a more objective measure of corruption, in the context 
of a road-building program in rural Indonesia. The paper shows 
empirically that villagers' perceptions of corruption do appear to be 
positively (though weakly) correlated with the more objective 
missing expenditures measure. Moreover, villagers appear to be able to 
distinguish between the overall probability of corruption in the village 
and corruption specific to the road project. 

Despite this, the magnitude of the correlation between reported 
corruption perceptions and missing expenditures is small. In part, this 
may be because, on average, almost all of the corruption in the project 
was hidden by inflating quantities, which are hard for villagers to 
detect, rather than marking up prices, which are easier for villagers to 
detect. This suggests an important feedback mechanism between 
transparency (which increases the ability of citizens to detect 
corruption) and corruption levels. It also suggests that, at least in this 
case, villagers do not currently possess enough capability to detect 
corruption to effectively monitor local officials without additional 
external help. 

I then examine the extent of biases in corruption perceptions. The 
results show that there are significant individual-level biases in how 
respondents answer the corruption question. Moreover, there is 
evidence that for some village-level characteristics, particularly those 
associated with levels of trust, such as ethnic heterogeneity and social 
participation, using perceptions to measure corruption can produce 
very different answers from the results obtained using a more 
objective measure of corruption. I present suggestive evidence in favor of 
the idea that biases in individuals' views about corruption can lead to 
increased monitoring behavior, which may in turn reduce corruption. 
These results suggest that perceptions data should be used for 
empirical research on the determinants of corruption with considerable 
caution, and that there is little alternative to continuing to collect 
more objective measures of corruption, difficult though that may be. 

Appendix A. Data details 

The original data was collected in 608 total villages. The sample in 
this paper, however, is limited to the 477 villages where the missing 
expenditures variable, described above, could be constructed. The 
missing expenditures variable could not be calculated in some villages 
for one of four reasons: (1) surveyor error in locating the road, (2) the 
project consisted largely of a partial rehabilitation of an existing road, 
(3) agglomerated expenditures reports (i.e., the village expenditure 
report combined expenditures in the road project with other projects 
that could not be independently measured, such as a school), or (4) 
villages that had asphalted the road and refused to let the engineers 
break the asphalt to conduct the engineering survey. 

The household survey was designed as a stratified random sample, 
containing between six and thirteen respondents per village, selected 
as follows. Two respondents were selected from the hamlets in which 
the road was located by first randomly selecting a hamlet, and then 
randomly selecting a neighborhood (RT) in that hamlet. A complete 
list of households in the RT was obtained from the neighborhood 
head, and two households were drawn randomly from that list. 
Individual respondents were drawn from a list of all adults age 18 or 
over in the selected households. Two additional respondents were 
selected from the hamlets in which the road was not being built using 
the same procedure. As men in the village tend to participate much 
more in road construction activities, the randomization was designed 
such that, of the four respondents selected in this manner, three were 
men and one was a woman. In villages receiving the Comment Form 
treatment, an additional four respondents were drawn using the same 
procedure, two from hamlets with the project and two from hamlets 
that did not contain the project. In each village, two additional 
respondents were drawn randomly from the attendance list of Village 
Meeting II, which was held before the randomization was announced, 
and is therefore exogenous with respect to the experiments. Finally, in 
some Comment Form villages an additional 3 respondents were 
added, randomly selected from the two neighborhoods above (the 
reasons for this will be discussed below). Each respondent received 
compensation of Rp. 10,000 ($1.20), equal to slightly more than half of 
the typical daily agricultural wage in the study area. 

Given this sample selection, a natural question is whether the 
sample should be re-weighted to reflect the fact that different 
respondents had different probabilities of being sampled. As is apparent 
from the description of the sampling, women were systematically 
undersampled, and those who attended a pre-randomization village 
meeting were systematically oversampled. In all specifications, I 
control for how the respondent was sampled (i.e., whether the respondent 
was from a hamlet with or without the road project, whether the 
respondent was selected from the attendance list at Village Meeting II, 
and whether the household was one of the 3 additional households 
added in the Comment Form villages). The question is whether, given 
these controls for level differences among these samples, one needs to 
reweight to account for treatment effect heterogeneity in the 
relationship between missing expenditures and perceptions. As discussed by 
Deaton (1995), weighting the sample makes the point estimates 
invariant to survey design, but reduces the effective power and, in the 
presence of treatment heterogeneity, does not necessarily obtain 
consistent estimates for the true average population effect. Accordingly, the 
results presented in the text are unweighted. Using sample 
weights in the regressions that account for the differential probability 
of sampling and that weight each village equally (i.e., so that 
comment form villages are not over-weighted in the regression) does not 
substantively change the results, but not surprisingly it does reduce the 
statistical significance of the missing expenditures variable in columns 
(2) and (3) of Table 2. 
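
As a rough illustration of the weighting scheme described above, the following Python sketch computes inverse-probability weights and rescales them so that each village carries the same total weight. The function name, the village labels, and the sampling probabilities are all hypothetical; the paper does not report the actual probabilities or the software used. 

```python
from collections import defaultdict


def village_equal_weights(rows):
    """Return one weight per respondent.

    rows: list of (village_id, sampling_probability) pairs, one per
    respondent. Each weight is the inverse sampling probability,
    rescaled so that the weights within each village sum to 1.
    """
    raw = [1.0 / prob for _, prob in rows]

    # Total raw weight per village, used for the rescaling step.
    totals = defaultdict(float)
    for (village, _), weight in zip(rows, raw):
        totals[village] += weight

    return [weight / totals[village] for (village, _), weight in zip(rows, raw)]


# Hypothetical data: village "A" has 4 respondents with unequal sampling
# probabilities (e.g., a meeting attendee is heavily oversampled), while
# village "B" is a comment form village with 8 respondents.
rows = [("A", 0.02), ("A", 0.02), ("A", 0.01), ("A", 0.20)] + [("B", 0.05)] * 8
weights = village_equal_weights(rows)
```

Under this scheme the larger sample in village "B" does not give it more influence than village "A", and within a village an oversampled respondent receives proportionally less weight. 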

As discussed above, beyond simply measuring perceptions, an 
additional goal of the survey was to measure how stated perceptions 
about corruption change when respondents know that their answers 
will be used for monitoring. To examine this, after all corruption 
questions except for questions involving corruption in the road project 
had been asked, a randomly selected subset of respondents in the 
Comment Form villages were told that their responses to the questions 
about corruption in the project would be used, anonymously, as part of 
the overall report on the comment forms presented at the 
accountability meeting (the Form B treatment discussed above). Due to a 
training error, approximately 60% of enumerators appear to have 
given Form B surveys to all households in Comment Form villages. In 
the Form B experiment discussed above, I restrict attention to those 
villages where the experiment was carried out properly. Moreover, in 
approximately half of all Comment Form villages, three additional 
households were surveyed, drawn randomly from the same 
neighborhoods as before, two of whom received Form A and one of whom 
received Form B. In all specifications I include a dummy variable for 
which version of the form each household received, 
as well as dummies for whether the household was sampled as part of 
these additional three households per village, in addition to dummies for 
which experimental treatment the village was assigned (i.e., comment 
forms, invitations, or audits). Although I include these dummies in 
all specifications in this paper, doing so does not substantially affect 
the results. 

In addition to the corruption question, the household survey 
included a wide variety of other covariates, such as a household roster, 
education levels, participation in social activities and in the road 
project, assets, and family relationships to various village officials. To 
estimate household expenditure of respondents, I used the 1999 SSD 
(Hundred Villages Survey), an Indonesian statistics bureau dataset, 
containing 3193 rural Javanese households. The SSD asked both a 
detailed expenditure questionnaire and the same set of asset 
questions used in my household survey. In the SSD, I used OLS to 
estimate the relationship between log household expenditure and the 
following variables, all of which I observe in my survey: log household 
size, whether the household was headed by a woman, the percentage 
of household members consisting of children ages 0-3, 4-6, 7-9, 
10-12, and 13-16, dummies for whether the household has a stove, 
refrigerator, radio, television, satellite dish, motorbike, car, and 
electricity, dummies for floor type, wall type, and ceiling type, the 
total amount of land held by the household, whether the household 
consumes meat at least once a week, whether each household 
member has at least two sets of clothes, and whether the household uses 
modern medicine when a child is sick. I then used the estimated 
coefficients from the SSD to predict household expenditure in my 
survey. Combined, these 34 variables have an R-squared of 0.58 in 
predicting log household expenditure in the SSD, which suggests that 
predicted expenditure is a reasonable approximation for actual 
expenditure, at least for the purposes used here. 
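
The prediction step can be sketched as follows. This is a minimal illustration in Python, not the author's code: it fits OLS on a reference sample, applies the estimated coefficients to a second sample, and reports the in-sample R-squared. The two covariates and all numbers are invented toy values standing in for the 34 variables and the SSD data. 

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(features, y):
    """Fit y on features plus a constant; returns [intercept, b1, ...]."""
    x = [[1.0] + list(row) for row in features]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)  # normal equations


def predict(coefs, features):
    """Apply fitted coefficients to rows of features."""
    return [coefs[0] + sum(b * v for b, v in zip(coefs[1:], row))
            for row in features]


def r_squared(y, fitted):
    """Share of the variance of y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - f) ** 2 for a, f in zip(y, fitted))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference sample (stand-in for the SSD):
# columns are log household size and a television dummy.
ref_x = [(0.0, 0), (0.7, 1), (1.1, 0), (1.4, 1), (1.6, 0)]
ref_y = [10.1, 10.6, 10.5, 11.1, 10.7]  # log household expenditure (toy values)

coefs = fit_ols(ref_x, ref_y)
fitted = predict(coefs, ref_x)
r2 = r_squared(ref_y, fitted)

# Hypothetical survey households with the same covariates but no
# expenditure data; their log expenditure is predicted out of sample.
survey_x = [(1.0, 1), (0.5, 0)]
predicted_survey = predict(coefs, survey_x)
```

The approach relies on the covariates being measured identically in both datasets, which is why the text stresses that the SSD used the same set of asset questions as the household survey. 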